// Feature Matrix
Every manual step, automated.
Six capabilities. Six terminal commands. Scroll through and watch the manual alternative become progressively more absurd.
Data Ingestion
From any source to a clean tensor in one command
terminal
$ automl ingest --source revenue_q4.csv --profile
| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Supported formats | CSV, Parquet, JSON, Avro, SQL, S3, GCS | Write custom connectors per format (~200 LOC each) |
| Null handling | Auto-detect + impute (median/mode/KNN) | df.fillna() per column, 40–80 lines |
| Type inference | Automatic with datetime parsing | pd.to_datetime() + dtypes audit, ~60 lines |
| Outlier detection | IQR + Z-score, configurable policy | Custom logic per feature, 100+ lines |
| Schema validation | Auto-generated + drift alerts | Write Pydantic/Great Expectations schema manually |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 line | 3s | 12+ |
| Manual | ~400 lines | 2–3 days | 1 at a time |
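To make the "IQR + Z-score" row concrete, here is a minimal, standard-library sketch of what that kind of check can look like when written by hand. The function names, the `policy` argument, and the sample `revenue` values are all hypothetical illustrations; this is not AutoMLHub's implementation.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Indices of values whose absolute population z-score exceeds threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # constant column: nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


def flag_outliers(values, policy="either", z_threshold=3.0):
    """Combine both tests. 'either' takes the union, anything else the intersection."""
    iqr_hits = set(iqr_outliers(values))
    z_hits = set(zscore_outliers(values, z_threshold))
    hits = iqr_hits | z_hits if policy == "either" else iqr_hits & z_hits
    return sorted(hits)


# Hypothetical sample column with one obvious spike at index 9.
revenue = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9, 103.1, 100.0, 99.1, 450.0]
print(flag_outliers(revenue))                               # union of both tests
print(flag_outliers(revenue, policy="both", z_threshold=2.5))  # intersection
```

One design note: with only ten points, a population z-score can never exceed 3, so the z-score test at the default cutoff misses the spike that the IQR test catches. That is one reason to run both tests under a configurable policy rather than rely on a single rule.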
Feature Engineering
Automated transformations that would take a sprint to write
terminal
$ automl features --auto --importance-threshold 0.02
| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Categorical encoding | Auto-select: OHE / target / ordinal by cardinality | Decide + implement per column, ~80 lines |
| Numeric scaling | StandardScaler / MinMax / RobustScaler auto-chosen | sklearn Pipeline boilerplate, ~40 lines |
| Datetime features | 28 derived features from timestamp columns | Custom extraction per column, 60–120 lines |
| Feature interactions | Polynomial + cross-feature terms, pruned by importance | Manual specification, combinatorial explosion risk |
| Feature selection | SHAP-based pruning, configurable threshold | Recursive elimination or correlation matrix analysis |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 flag | 8s | 94 features auto |
| Manual | ~600 lines | 3–5 days | Whatever you thought of |
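As an illustration of what "auto-select by cardinality" might involve, the sketch below picks an encoding per column and implements two of the encoders in plain Python. The `ohe_max=10` cutoff, the `ordered` flag, and all function names are assumptions made for the example; the actual selection rules are not documented here.

```python
from collections import Counter


def choose_encoding(values, ordered=False, ohe_max=10):
    """Pick an encoding name from column cardinality (assumed heuristic)."""
    if ordered:
        return "ordinal"      # categories with a natural order
    if len(set(values)) <= ohe_max:
        return "one_hot"      # few categories: one column per category
    return "target"           # many categories: replace with mean target


def one_hot(values):
    """One-hot encode using alphabetically sorted categories as columns."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def target_encode(values, targets):
    """Replace each category with the mean target observed for it."""
    sums, counts = Counter(), Counter()
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]


plan = ["basic", "pro", "basic"]
print(choose_encoding(plan))                    # low cardinality
print(one_hot(plan))
print(target_encode(plan, [1.0, 4.0, 3.0]))
```

Note that this naive target encoding uses each row's own label, which leaks the target into the features. A production version would compute the means out-of-fold or add smoothing.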
Model Selection
Race 47 candidates. Crown the winner. Skip the argument.
terminal
$ automl select --task classification --metric f1 --time-budget 30m
| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Candidate models | 47 across tree, linear, neural, ensemble families | Whatever you have time to try (usually 3–5) |
| Evaluation metric | 20+ metrics, multi-objective support | Hardcode metric in training loop |
| Cross-validation | Stratified k-fold auto-configured | StratifiedKFold boilerplate, ~50 lines |
| Parallelism | All candidates on available cores simultaneously | Sequential unless you write joblib/Ray logic |
| Ensemble creation | Stacking + blending from top-k models | Complex custom implementation, 200+ lines |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | 28 min | 47 models |
| Manual | ~800 lines | 1–2 weeks | 3–5 models |
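For readers unfamiliar with the cross-validation row, this is a rough sketch of the idea behind stratified k-fold: each fold keeps roughly the same class mix as the full dataset. It is a simplified, deterministic stand-in (no shuffling) written for illustration, not scikit-learn's `StratifiedKFold` and not AutoMLHub's internals; the function name and sample labels are hypothetical.

```python
from collections import defaultdict


def stratified_kfold(labels, k=5):
    """Yield (train_idx, test_idx) pairs with class proportions roughly preserved."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Deal each class's rows round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Hypothetical imbalanced labels: 8 negatives, 4 positives.
labels = [0] * 8 + [1] * 4
for train, test in stratified_kfold(labels, k=4):
    print(test, [labels[i] for i in test])
```

Every candidate model would then be scored on the same folds, so that the comparison between candidates is like for like.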
Hyperparameter Tuning
Stop guessing. Let Bayesian search converge on a near-optimal configuration.
terminal
$ automl tune --model XGBoost_v3 --trials 200 --sampler tpe
| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Search algorithm | Bayesian (TPE), Grid, Random, CMA-ES | GridSearchCV (exhaustive, slow) or manual guessing |
| Search space definition | Auto-generated from model type | Define param_grid manually, ~40 lines |
| Early stopping | Automatic with configurable patience | Custom callback implementation |
| Parallel trials | Distributed across cores/machines | Requires Dask/Ray setup, ~200 lines overhead |
| Result tracking | Built-in MLflow-compatible experiment log | Set up MLflow or W&B integration separately |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | ~2 hours | Bayesian optimal |
| Manual | ~500 lines | 3–7 days | Grid or luck |
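The sketch below shows the shape of a search loop with patience-based early stopping. To stay dependency-free it uses plain random search, one of the listed algorithms, rather than TPE. The toy loss, the parameter names, and the `patience=25` default are assumptions chosen for the example.

```python
import random


def random_search(objective, space, trials=200, patience=25, seed=0):
    """Minimize objective over a box-shaped space; stop after `patience` stale trials."""
    rng = random.Random(seed)
    best_params, best_score, stale = None, float("inf"), 0
    history = []
    for _ in range(trials):
        # Sample every parameter uniformly from its (low, high) range.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        history.append((params, score))
        if score < best_score:
            best_params, best_score, stale = params, score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement for `patience` consecutive trials
    return best_params, best_score, history


def toy_loss(p):
    """Stand-in for a validation loss; minimum at learning_rate=0.1, subsample=0.8."""
    return (p["learning_rate"] - 0.1) ** 2 + (p["subsample"] - 0.8) ** 2


space = {"learning_rate": (0.01, 0.3), "subsample": (0.5, 1.0)}
best_params, best_score, history = random_search(toy_loss, space)
print(len(history), best_params, best_score)
```

A Bayesian sampler such as TPE replaces the uniform sampling step with one that favors regions where earlier trials scored well. The loop, the history log, and the stopping rule stay much the same.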
Deployment
From winning model to live REST endpoint: one command.
terminal
$ automl deploy --model XGBoost_v3 --env production --replicas 3
| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Serving framework | FastAPI auto-generated with OpenAPI schema | Write Flask/FastAPI app from scratch, ~300 lines |
| Containerization | Dockerfile auto-generated + optimized | Write Dockerfile, manage base images |
| Input validation | Pydantic schema inferred from training data | Manual schema definition + validation logic |
| Scaling config | Replica count + resource limits, one flag | Kubernetes YAML manifests, ~150 lines |
| Rollback | Instant rollback to any previous model version | Custom CI/CD pipeline with versioning logic |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | < 4 min | REST + gRPC + batch |
| Manual | ~1,200 lines | 2–4 days | Whatever you build |
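To show what "schema inferred from training data" can mean in practice, here is a small standard-library sketch that infers field types from sample rows and validates a request payload against them. It stands in for a Pydantic model and is not the generated code. The `region` field and both function names are hypothetical; `customer_ltv` is borrowed from the quick-start request.

```python
def infer_schema(rows):
    """Map each field to the Python type observed in the training rows."""
    schema = {}
    for row in rows:
        for key, value in row.items():
            seen = type(value)
            previous = schema.setdefault(key, seen)
            if previous is not seen:
                # Widen int/float mixes to float; anything else falls back to str.
                schema[key] = float if {previous, seen} == {int, float} else str
    return schema


def validate(payload, schema):
    """Return a list of problems; an empty list means the payload is valid."""
    errors = []
    for key, expected in schema.items():
        if key not in payload:
            errors.append(f"missing field: {key}")
            continue
        value = payload[key]
        if expected is float:
            # Accept ints where floats are expected, but never booleans.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = type(value) is expected
        if not ok:
            errors.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    errors.extend(f"unexpected field: {key}" for key in payload if key not in schema)
    return errors


training_rows = [
    {"customer_ltv": 8400, "region": "EU"},
    {"customer_ltv": 9120.5, "region": "US"},
]
schema = infer_schema(training_rows)
print(validate({"customer_ltv": 8400, "region": "EU"}, schema))
print(validate({"customer_ltv": "8400"}, schema))
```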
Monitoring
Know the moment your model starts lying to your users.
terminal
$ automl monitor --model XGBoost_v3 --alert-threshold 0.05
| Capability | AutoMLHub | Manual Workflow |
| --- | --- | --- |
| Data drift detection | PSI + KS-test on all features, auto-baseline | Custom evidently/whylogs integration, ~300 lines |
| Performance tracking | Real-time F1/RMSE on labelled stream | Custom logging + metrics pipeline |
| Alerting | Slack / PagerDuty / webhook out-of-the-box | Custom alert logic + notification integration |
| Retraining trigger | Auto-trigger on drift threshold breach | Custom cron + threshold logic, ~200 lines |
| Explainability | SHAP values + feature drift per prediction | Separate SHAP integration, ~100 lines |

| | LOC | Time | Coverage |
| --- | --- | --- | --- |
| AutoMLHub | 1 command | < 30s setup | All features |
| Manual | ~900 lines | 1–2 weeks | What you build |
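For the drift-detection row, the Population Stability Index (PSI) can be sketched in a few lines. The version below bins the baseline into equal-width buckets and compares bucket proportions between baseline and live data. The bin count, the `eps` floor for empty buckets, and the reading of `0.05` as a PSI alert threshold are assumptions for illustration; the KS-test half of the row is not shown.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the baseline range
            counts[idx] += 1
        total = len(values)
        # Floor empty buckets at eps so the log ratio stays finite.
        return [max(c / total, eps) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in baseline]

ALERT_THRESHOLD = 0.05  # assumed to mirror --alert-threshold
print(psi(baseline, baseline))                    # identical samples
print(psi(baseline, shifted) > ALERT_THRESHOLD)   # shifted distribution
```

A retraining trigger would then compare the PSI of each feature against the threshold on a schedule and fire when any feature breaches it.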
Ready to install
Install in 60 seconds.
One command. No config files. No cloud account. No vendor to call when pricing changes.
pip · terminal
$ pip install automlhub
Requires Python 3.9+. Installs CLI + Python SDK.
Quick start — 3 commands to a deployed model
$ automl ingest --source data.csv  # ingest & profile your dataset
$ automl fit --target revenue --deploy  # train, select, tune, deploy
$ curl api.localhost/predict -d '{"customer_ltv": 8400}'  # call your live endpoint
⚖️ Apache 2.0: Forever free
🔒 No telemetry: Your data stays yours
🖥️ Self-hosted: Your infra, your rules
⚡ 47 models: Per training run