v2.4.1 — Apache 2.0 Open Source

$ automl fit --target revenue --deploy

Raw data in.
Production model out.

AutoMLHub eats your dataset and spits out a deployed model before most teams finish arguing about hyperparameters. Zero vendor lock-in. Fully open source.

Install in 60 Seconds · Star on GitHub (12.4k)

automlhub · training run #2847 · LIVE

Dataset: ingested

file: revenue_q4.csv
rows: 847,293
features: 34
target: revenue
nulls: 0 (cleaned)
task: regression

Feature importance

customer_ltv: 82%
churn_score: 61%
region_code: 44%
product_tier: 38%

Model Leaderboard (sorted by F1 ↓)

| Model | F1 | Time |
| --- | --- | --- |
| ✓ XGBoost_v3 | 0.947 | 2m 14s |
| LightGBM_v2 | 0.931 | 1m 52s |
| RandomForest_v1 | 0.918 | 3m 08s |
| CatBoost_v1 | 0.904 | - |
| NeuralNet_v2 | - | - |

[14:32:07] XGBoost_v3 — converged at epoch 847 · F1=0.947
[14:32:09] CatBoost_v1 — training epoch 312/480 · ETA 1m 23s
[14:32:11] automl deploy --model XGBoost_v3 --env production
[14:32:12] ✓ deployed → api.company.io/predict · p50=14ms

47 models evaluated per run

3,200 lines of boilerplate saved

8.3× faster than manual pipeline

// Feature Matrix

Every manual step, automated.

Six capabilities. Six terminal commands. Scroll through and watch the manual alternative become progressively more absurd.

Data Ingestion

From any source to a clean tensor in one command

terminal
$ automl ingest --source revenue_q4.csv --profile

| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Supported formats | CSV, Parquet, JSON, Avro, SQL, S3, GCS | Write custom connectors per format (~200 LOC each) |
| Null handling | Auto-detect + impute (median/mode/KNN) | df.fillna() per column, 40–80 lines |
| Type inference | Automatic with datetime parsing | pd.to_datetime() + dtypes audit, ~60 lines |
| Outlier detection | IQR + Z-score, configurable policy | Custom logic per feature, 100+ lines |
| Schema validation | Auto-generated + drift alerts | Write Pydantic/Great Expectations schema manually |
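
For a sense of what the IQR + Z-score row involves, here is a minimal standard-library sketch of both checks. It is not AutoMLHub's implementation: the function names, the 1.5 IQR multiplier, the z-score thresholds, and the sample values are all assumptions picked for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # a constant column has no outliers
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical revenue column with one obviously bad row.
revenue = [120, 130, 125, 128, 122, 127, 131, 124, 126, 5000]

flagged_by_iqr = iqr_outliers(revenue)
# With only 10 rows, no z-score can exceed sqrt(n - 1) = 3,
# so a lower threshold is used for this small sample.
flagged_by_z = zscore_outliers(revenue, threshold=2.5)
```

The two checks can disagree on small samples, which is one reason a configurable policy is useful.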

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 line | 3s | 12+ |
| Manual | ~400 lines | 2–3 days | 1 at a time |
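
The manual side of null handling usually looks something like the sketch below: median for numeric columns, mode for everything else. This illustrates the general technique only, not AutoMLHub internals. The helper name and sample rows are hypothetical, and KNN imputation is left out to keep it short.

```python
import statistics


def impute_column(values):
    """Fill None entries: median for numeric columns, mode otherwise."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to learn a fill value from
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    fill = statistics.median(present) if numeric else statistics.mode(present)
    return [fill if v is None else v for v in values]


# Hypothetical columns; the names echo the demo dataset above.
columns = {
    "customer_ltv": [8400, None, 7200, 9100],
    "region_code": ["EU", "US", None, "EU"],
}
cleaned = {name: impute_column(col) for name, col in columns.items()}
```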

$ pip install automlhub

Feature Engineering

Automated transformations that would take a sprint to write

terminal
$ automl features --auto --importance-threshold 0.02

| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Categorical encoding | Auto-select: OHE / target / ordinal by cardinality | Decide + implement per column, ~80 lines |
| Numeric scaling | StandardScaler / MinMax / RobustScaler auto-chosen | sklearn Pipeline boilerplate, ~40 lines |
| Datetime features | 28 derived features from timestamp columns | Custom extraction per column, 60–120 lines |
| Feature interactions | Polynomial + cross-feature terms, pruned by importance | Manual specification, combinatorial explosion risk |
| Feature selection | SHAP-based pruning, configurable threshold | Recursive elimination or correlation matrix analysis |
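
Choosing an encoding by cardinality can be sketched in a few lines. The cutoff of 10 distinct values, the function names, and the sample data are assumptions for illustration; the rule AutoMLHub actually applies is not documented here, and ordinal encoding is omitted.

```python
def choose_encoding(values, one_hot_max=10):
    """Pick an encoding name from the number of distinct categories."""
    cardinality = len(set(values))
    return "one_hot" if cardinality <= one_hot_max else "target"


def one_hot(values):
    """One 0/1 column per category, in sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target observed for it.

    In practice the means should come from training folds only,
    otherwise the encoding leaks the target into the features.
    """
    sums, counts = {}, {}
    for v, t in zip(values, targets):
        sums[v] = sums.get(v, 0.0) + t
        counts[v] = counts.get(v, 0) + 1
    return [sums[v] / counts[v] for v in values]


# Hypothetical low-cardinality column and target.
product_tier = ["basic", "pro", "basic", "enterprise"]
revenue = [100, 300, 200, 900]

encoding = choose_encoding(product_tier)
encoded = one_hot(product_tier)
```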

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 flag | 8s | 94 features auto |
| Manual | ~600 lines | 3–5 days | Whatever you thought of |
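
Datetime expansion is the easiest of these to picture. The sketch below derives eight features from a single timestamp. Which 28 features AutoMLHub derives is not listed here, so the feature names and the sample date are assumptions.

```python
from datetime import datetime


def datetime_features(ts: datetime) -> dict:
    """Derive a handful of calendar features from one timestamp."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday is 0
        "is_weekend": ts.weekday() >= 5,
        "quarter": (ts.month - 1) // 3 + 1,
        "day_of_year": ts.timetuple().tm_yday,
    }


# Hypothetical Q4 timestamp.
features = datetime_features(datetime.fromisoformat("2024-11-30T14:32:07"))
```

Writing this once is trivial; writing, testing, and maintaining it for every timestamp column is where the 60–120 lines come from.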

Model Selection

Race 47 candidates. Crown the winner. Skip the argument.

terminal
$ automl select --task regression --metric f1 --time-budget 30m

| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Candidate models | 47 across tree, linear, neural, ensemble families | Whatever you have time to try (usually 3–5) |
| Evaluation metric | 20+ metrics, multi-objective support | Hardcode metric in training loop |
| Cross-validation | Stratified k-fold auto-configured | StratifiedKFold boilerplate, ~50 lines |
| Parallelism | All candidates on available cores simultaneously | Sequential unless you write joblib/Ray logic |
| Ensemble creation | Stacking + blending from top-k models | Complex custom implementation, 200+ lines |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | 28 min | 47 models |
| Manual | ~800 lines | 1–2 weeks | 3–5 models |
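
The core of a model race is small: score every candidate with the same cross-validation split and sort. The sketch below races two toy candidates with plain (non-stratified) k-fold and RMSE, since the example data is a regression problem. The candidates, fold scheme, and data are all assumptions; nothing here reflects how AutoMLHub schedules its 47 models.

```python
import math
import statistics


def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean = statistics.fmean(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Least-squares line through one feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def kfold_rmse(fit, xs, ys, k=5):
    """Mean RMSE over k folds; fold i holds out rows i, i+k, i+2k, ..."""
    n = len(xs)
    scores = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        errors = [(model(xs[i]) - ys[i]) ** 2 for i in held_out]
        scores.append(math.sqrt(statistics.fmean(errors)))
    return statistics.fmean(scores)


# Hypothetical data: a noisy straight line.
xs = list(range(20))
ys = [3.0 * x + 5.0 + (1 if x % 2 else -1) for x in xs]

candidates = {"mean_baseline": fit_mean, "linear": fit_line}
leaderboard = sorted((kfold_rmse(fit, xs, ys), name) for name, fit in candidates.items())
winner = leaderboard[0][1]  # lowest RMSE first
```

Scaling this to dozens of model families, several metrics, and parallel execution is where the manual line count grows.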

Hyperparameter Tuning

Stop guessing. Let Bayesian search home in on strong hyperparameters within your trial budget.

terminal
$ automl tune --model XGBoost_v3 --trials 200 --sampler tpe

| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Search algorithm | Bayesian (TPE), Grid, Random, CMA-ES | GridSearchCV (exhaustive, slow) or manual guessing |
| Search space definition | Auto-generated from model type | Define param_grid manually, ~40 lines |
| Early stopping | Automatic with configurable patience | Custom callback implementation |
| Parallel trials | Distributed across cores/machines | Requires Dask/Ray setup, ~200 lines overhead |
| Result tracking | Built-in MLflow-compatible experiment log | Set up MLflow or W&B integration separately |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | ~2 hours | Bayesian optimal |
| Manual | ~500 lines | 3–7 days | Grid or luck |
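
To make the trial loop concrete, here is a sketch of random search with patience-based early stopping. Random search stands in for TPE, which is considerably more involved. The toy objective, the search ranges, and the patience value are assumptions; only the trial count of 200 comes from the command above.

```python
import random


def objective(params):
    """Toy validation loss, lowest at learning_rate=0.1 and max_depth=6."""
    return (params["learning_rate"] - 0.1) ** 2 + 0.01 * (params["max_depth"] - 6) ** 2


def sample_params(rng):
    """Draw one configuration from a hypothetical search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-3, 0),  # log-uniform in [0.001, 1]
        "max_depth": rng.randint(2, 12),
    }


def random_search(n_trials=200, patience=50, seed=0):
    """Stop early once `patience` consecutive trials bring no improvement."""
    rng = random.Random(seed)
    best_params, best_loss, stale = None, float("inf"), 0
    history = []
    for _ in range(n_trials):
        params = sample_params(rng)
        loss = objective(params)
        history.append(loss)
        if loss < best_loss:
            best_params, best_loss, stale = params, loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_params, best_loss, history


best_params, best_loss, history = random_search()
```

A Bayesian sampler replaces `sample_params` with a proposal informed by earlier trials; the surrounding loop stays much the same.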

Deployment

From winning model to live REST endpoint: one command.

terminal
$ automl deploy --model XGBoost_v3 --env production --replicas 3

| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Serving framework | FastAPI auto-generated with OpenAPI schema | Write Flask/FastAPI app from scratch, ~300 lines |
| Containerization | Dockerfile auto-generated + optimized | Write Dockerfile, manage base images |
| Input validation | Pydantic schema inferred from training data | Manual schema definition + validation logic |
| Scaling config | Replica count + resource limits in one flag | Kubernetes YAML manifests, ~150 lines |
| Rollback | Instant rollback to any previous model version | Custom CI/CD pipeline with versioning logic |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | < 4 min | REST + gRPC + batch |
| Manual | ~1,200 lines | 2–4 days | Whatever you build |
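
Inferring an input schema from training data boils down to recording each field's type and checking requests against it. The sketch below uses plain dictionaries as a stand-in for Pydantic. The function names and sample rows are hypothetical, and it does not represent the schema AutoMLHub generates.

```python
def infer_schema(rows):
    """Map each field to the Python type first seen in the training rows."""
    schema = {}
    for row in rows:
        for field, value in row.items():
            schema.setdefault(field, type(value))
    return schema


def validate(payload, schema):
    """Return a list of problems; an empty list means the payload is valid."""
    problems = [f"missing field: {field}" for field in schema if field not in payload]
    for field, value in payload.items():
        expected = schema.get(field)
        if expected is None:
            problems.append(f"unexpected field: {field}")
        elif not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# Hypothetical training rows.
training_rows = [{"customer_ltv": 8400, "region_code": "EU"}]
schema = infer_schema(training_rows)

ok = validate({"customer_ltv": 9100, "region_code": "US"}, schema)
bad = validate({"customer_ltv": "8400"}, schema)
```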

Monitoring

Know the moment your model starts lying to your users.

terminal
$ automl monitor --model XGBoost_v3 --alert-threshold 0.05

| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Data drift detection | PSI + KS-test on all features, auto-baseline | Custom evidently/whylogs integration, ~300 lines |
| Performance tracking | Real-time F1/RMSE on labelled stream | Custom logging + metrics pipeline |
| Alerting | Slack / PagerDuty / webhook out-of-the-box | Custom alert logic + notification integration |
| Retraining trigger | Auto-trigger on drift threshold breach | Custom cron + threshold logic, ~200 lines |
| Explainability | SHAP values + feature drift per prediction | Separate SHAP integration, ~100 lines |
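
The Population Stability Index (PSI) named in the drift row compares how a feature is distributed across bins in a baseline sample versus live traffic. A minimal sketch follows. The bin count, the epsilon floor, and the sample data are assumptions, and treating 0.05 as a PSI cutoff is a guess at what `--alert-threshold` controls.

```python
import math


def psi(baseline, live, bins=10, eps=1e-6):
    """Population Stability Index over equal-width bins of the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(sample):
        counts = [0] * bins
        for v in sample:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(sample), eps) for c in counts]

    p, q = proportions(baseline), proportions(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical feature values: live traffic is shifted upward by 0.3.
baseline = [i / 100 for i in range(100)]
live = [v + 0.3 for v in baseline]

drift_score = psi(baseline, live)
drifted = drift_score > 0.05  # assumed meaning of the alert threshold
```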

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | < 30s setup | All features |
| Manual | ~900 lines | 1–2 weeks | What you build |
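
The other drift check mentioned above, the two-sample Kolmogorov-Smirnov test, rests on one statistic: the largest gap between two empirical CDFs. The sketch below computes only that statistic and leaves out the p-value. The function name and samples are illustrative assumptions.

```python
import bisect


def ks_statistic(baseline, live):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(baseline), sorted(live)
    gap = 0.0
    # The maximum gap always occurs at one of the observed values.
    for v in a + b:
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical samples: identical, then completely separated.
same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
separated = ks_statistic([1, 2, 3], [10, 11, 12])
```

A retraining trigger is then a comparison of this statistic, or its p-value, against a chosen threshold for each feature.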

Ready to install

Install in 60 seconds.

One command. No config files. No cloud account. No vendor to call when pricing changes.

pip · terminal

$ pip install automlhub

Requires Python 3.9+. Installs CLI + Python SDK.

Quick start — 3 commands to a deployed model
$ automl ingest --source data.csv
# ingest & profile your dataset
$ automl fit --target revenue --deploy
# train, select, tune, deploy
$ curl api.localhost/predict -d '{"customer_ltv": 8400}'
# call your live endpoint
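
If you would rather call the endpoint from Python than curl, a standard-library sketch looks like this. The `http://` scheme, the JSON content type, and the helper name are assumptions; the response format is not documented here, so the example stops at building the request.

```python
import json
import urllib.request


def build_predict_request(features, url="http://api.localhost/predict"):
    """Build a POST request carrying the feature payload as JSON."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request({"customer_ltv": 8400})

# To send it against a running deployment:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.load(response))
```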

⚖️ Apache 2.0: Forever free

🔒 No telemetry: Your data stays yours

🖥️ Self-hosted: Your infra, your rules

⚡ 47 models: Per training run
