Learn Machine Learning concepts visually - no coding required!
Upload your data, explore patterns, and build predictive models with just a few clicks.
Select the type of analysis you want to perform. Each method has different data requirements.
K-Means Clustering
When to use: When you want to discover natural groups in your data without knowing them beforehand.
Question answered: "Are there distinct groups of customers with similar characteristics?"
Business examples:
Probability & Odds Analysis
When to use: When you want to know HOW MUCH each factor influences an outcome and calculate probabilities.
Question answered: "What is the probability of X happening, and which factors matter most?"
Business examples:
Predict Numeric Values
When to use: When you want to predict a continuous number and understand which factors drive it.
Question answered: "What value should we expect, and which factors matter most?"
Business examples:
Predict Categories with Rules
When to use: When you need clear, explainable rules to classify items into categories.
Question answered: "What rules determine which category something belongs to?"
Business examples:
Predict Numbers with Rules
When to use: When you need explainable rules to predict numeric values and see feature thresholds.
Question answered: "What value should we predict and what rules lead to it?"
Business examples:
or click to browse
CSV files up to 5MB, max 10,000 rows and 50 columns

| Column Name | Type | Missing | Unique Values | Statistics |
|---|---|---|---|---|
Select how to handle missing values for each column. "Remove rows" deletes any row where this column is blank. "Fill with median" (numeric columns only) replaces blanks with the column's middle value.
| Column | Type | Missing | Action |
|---|---|---|---|
Detect and optionally remove statistical outliers from numeric columns.
K-Means clustering finds natural groupings in your data by identifying customers who are similar to each other. Think of it like sorting students into study groups based on their learning styles and interests.
Your goal: Select features that describe your customers (like age, income, spending), choose a number of clusters, and let the algorithm find meaningful segments. Then interpret what makes each segment unique.
Note: Features are automatically standardized (scaled to have mean=0, std=1) before clustering. This ensures features with larger ranges (like income) don't dominate over smaller-ranged features (like age).
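The standardize-then-cluster pipeline described above can be sketched as follows. This is a minimal illustration assuming scikit-learn; the app's actual internals aren't shown here, and the toy data is made up.

```python
# Sketch: standardize features, then run K-Means (assumes scikit-learn).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Toy customer data: [age, income] - income's range is far larger than age's
X = np.array([[25, 30000], [27, 32000], [45, 90000], [47, 95000]], dtype=float)

# Standardize so income's large range doesn't dominate the distance calculation
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0).round(6))  # each column now has mean ~0

# Cluster the scaled data into 2 groups
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_scaled)
print(kmeans.labels_)  # the two young/low-income customers land together
```

Without the scaling step, income (tens of thousands) would swamp age (tens) in every distance computation, and the clusters would effectively ignore age.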
Select features and click "Run Clustering" to visualize customer segments
Each plot shows how clusters are distributed across two features. Look for clear separation between colors to identify well-defined segments.
A decision tree creates a series of yes/no questions to classify data - like a flowchart. Each branch represents a decision rule (e.g., "Is income > $50,000?"), and each leaf is a final prediction.
Your goal: Select what you want to predict (target) and which features to use. The tree will show you exactly which factors matter most and the rules it uses to make predictions. Great for explaining decisions to stakeholders!
Select a target, features, and click "Train Decision Tree" to visualize the If/Then logic
Why encoding? Machine learning models work with numbers. Text categories like "Yes/No" or "Male/Female" are converted to numbers (0, 1, 2, etc.) so the algorithm can process them. Use this table to interpret the tree diagram and rules above.
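The encode-then-train flow above can be sketched like this, assuming scikit-learn. The column names and data are illustrative, not the app's actual implementation.

```python
# Sketch: encode a Yes/No target, then train a shallow decision tree.
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: income (numeric feature) and a Yes/No purchase outcome
incomes = [[20000], [25000], [60000], [80000]]
bought = ["No", "No", "Yes", "Yes"]

# Encoding: text categories become numbers ("No" -> 0, "Yes" -> 1)
le = LabelEncoder()
y = le.fit_transform(bought)
print(dict(zip(le.classes_, range(len(le.classes_)))))  # {'No': 0, 'Yes': 1}

# Train the tree and print its if/then rules, flowchart-style
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(incomes, y)
print(export_text(tree, feature_names=["income"]))
```

The printed rules read like the flowchart described above, e.g. a single "income <= threshold" split separating the two classes.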
Logistic regression predicts the probability of an outcome and tells you exactly how much each factor influences that probability. Unlike decision trees, it gives you coefficients that quantify each factor's impact.
Your goal: Select a binary outcome to predict (Yes/No, Buy/Don't Buy) and the features that might influence it. The results will show you odds ratios (e.g., "each $10K income increase doubles the odds of purchase") and let you adjust the decision threshold to balance false positives vs. false negatives.
Configure settings and click "Train Logistic Regression" to see feature coefficients
Why encoding? Machine learning models work with numbers. Text categories like "Yes/No" or "Male/Female" are converted to numbers (0, 1, 2, etc.) so the algorithm can process them. Use this table to interpret the coefficients and confusion matrix.
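A minimal sketch of the coefficient and odds-ratio outputs described above, assuming scikit-learn. The income/purchase data is invented for illustration.

```python
# Sketch: logistic regression, with the odds ratio as exp(coefficient).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: income in $10K units vs. purchase (1 = bought)
X = np.array([[2], [3], [4], [6], [7], [8]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Odds ratio per 1-unit ($10K) increase in income:
# > 1 means higher income raises the odds of purchase
odds_ratio = np.exp(model.coef_[0][0])
print(round(odds_ratio, 2))

# Predicted probability of purchase at income = $50K
prob = model.predict_proba([[5.0]])[0, 1]
print(round(prob, 2))
```

Exponentiating a coefficient converts it from log-odds to an odds ratio, which is why a positive coefficient always corresponds to an odds ratio above 1.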
Linear regression predicts a continuous numeric value (like sales, price, or score) based on input features. It finds the best-fit line through your data, showing you exactly how much each factor contributes to the outcome.
Your goal: Select a numeric outcome to predict and the features that might influence it. The results will show you R² (how well the model fits), p-values (which factors are statistically significant), and residuals (prediction errors).
Configure settings and click "Train Regression Model" to see feature coefficients
| Feature | Coefficient | Std Error | t-Statistic | P-Value | Significance |
|---|---|---|---|---|---|
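A minimal sketch of fitting a linear model and reading off the coefficient and R², assuming scikit-learn (the standard errors and p-values in the table above would come from an OLS-style fit, e.g. statsmodels). The ad-spend/sales data is synthetic.

```python
# Sketch: linear regression recovers a known slope from noisy data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: sales = 3.0 * ad_spend + 5.0, plus a little noise
rng = np.random.default_rng(0)
ad_spend = rng.uniform(1, 10, size=30).reshape(-1, 1)
sales = 3.0 * ad_spend.ravel() + 5.0 + rng.normal(0, 0.5, size=30)

model = LinearRegression().fit(ad_spend, sales)
r2 = r2_score(sales, model.predict(ad_spend))

print(round(model.coef_[0], 1))    # close to the true slope of 3.0
print(round(model.intercept_, 1))  # close to the true intercept of 5.0
print(round(r2, 2))                # near 1: most variation explained
```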
A regression tree predicts numeric values by creating if/then rules, just like a classification tree. Instead of predicting categories, each leaf node predicts a number (the average of training samples in that group).
Your goal: Select a numeric variable to predict and features that might influence it. The tree will show you decision rules that lead to different predicted values - great for understanding what drives higher or lower outcomes!
Select a numeric target, features, and click "Train Regression Tree" to visualize the prediction rules
Why encoding? Categorical features like "Yes/No" are converted to numbers so the algorithm can process them.
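The "each leaf predicts the average of its group" behavior can be seen directly in a small sketch, assuming scikit-learn. The house-price data is invented.

```python
# Sketch: a depth-1 regression tree predicts each leaf's group mean.
from sklearn.tree import DecisionTreeRegressor, export_text

# Toy data: house size (sq ft) vs. price ($K), two obvious groups
sizes = [[800], [900], [1000], [2000], [2200], [2400]]
prices = [100, 110, 120, 300, 320, 340]

tree = DecisionTreeRegressor(max_depth=1).fit(sizes, prices)
print(export_text(tree, feature_names=["sqft"]))  # the if/then rule

# The two leaves predict the group averages: 110 for small, 320 for large
print(tree.predict([[850]])[0], tree.predict([[2100]])[0])
```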
| Statistic | Value |
|---|---|
| Mean | -- |
| Std Dev | -- |
| Min | -- |
| Max | -- |
After reviewing your analysis results, write a summary of what the data is telling you. What patterns did you discover? What insights are most important for decision-making? This summary will be included in your PDF report.
In classification trees, entropy measures "messiness" or uncertainty - high entropy means mixed data. In regression trees, we use MSE (Mean Squared Error) instead, measuring how spread out values are. Both types of trees try to reduce these measures by splitting data into cleaner groups.
Gini measures how often you'd be wrong if you randomly guessed a label. Lower Gini = purer groups = better! It's like asking "if I picked a random customer from this group, how likely am I to misclassify them?"
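Both purity measures can be computed by hand from class proportions, using the standard formulas entropy = -Σ p·log₂(p) and Gini = 1 - Σ p² (pure Python, no assumptions beyond the formulas).

```python
# Entropy and Gini impurity computed from class proportions.
from math import log2

def entropy(proportions):
    # "Messiness": 0 for a pure group, maximal for an even mix
    return -sum(p * log2(p) for p in proportions if p > 0)

def gini(proportions):
    # Chance of misclassifying a randomly drawn, randomly labeled point
    return 1 - sum(p * p for p in proportions)

# A perfectly mixed 50/50 group is maximally "messy"...
print(entropy([0.5, 0.5]), gini([0.5, 0.5]))  # 1.0 0.5

# ...while a pure group scores zero on both measures
print(entropy([1.0]), gini([1.0]))  # 0.0 0.0
```

A good split is one whose child groups score lower on these measures than the parent did, which is exactly what the tree optimizes.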
Overfitting is like memorizing answers instead of learning concepts. The model becomes too specific to your training data and fails on new data. It's like a student who memorizes test answers but can't apply knowledge to new problems.
A centroid is the "center point" of a cluster - imagine it as the average customer in each group. K-Means positions centroids to minimize the distance between each point and its nearest centroid.
Clusters are groups of similar data points. Think of it as automatically sorting customers into groups based on their behavior - like "budget shoppers", "premium buyers", and "occasional visitors".
Features are the characteristics or attributes you use to make predictions. For customers, features might include age, income, purchase frequency, etc. They're the "inputs" to your model.
Machine learning algorithms work with numbers, not text. Encoding converts text categories (like "Male/Female" or "Yes/No") into numbers (0, 1, 2...). When you see a number in the results, check the encoding table to see what category it represents.
The target is what you're trying to predict - the "answer" you want the model to learn. For example, "Will this customer buy?" or "Is this email spam?"
Accuracy is simply "how often is the model correct?" If a model has 85% accuracy, it makes the right prediction 85 out of 100 times. Higher is better!
K is how many groups you want to create. Choosing K is part art, part science. Too few clusters = oversimplified groups. Too many = overly specific groups that don't generalize well.
Max depth limits how "tall" your decision tree can grow. A deeper tree can learn more complex patterns but risks overfitting. A shallower tree is simpler and more generalizable.
The silhouette score measures how similar points are to their own cluster vs other clusters. Ranges from -1 to 1: scores near 1 mean well-separated clusters, near 0 means overlapping clusters, negative means points might be in the wrong cluster.
Inertia measures how spread out points are within each cluster. Lower inertia = tighter clusters. The "elbow method" plots inertia vs K to find where adding more clusters stops helping much.
This index measures cluster separation - specifically, how distinct clusters are from each other. Lower values indicate better-defined, more separated clusters. Aim for values closer to 0.
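All three clustering metrics above can be computed in one sketch, assuming scikit-learn and assuming the separation index described above is the Davies-Bouldin index (which matches its description: lower is better, aim near 0). The two toy clusters are deliberately well separated.

```python
# Sketch: silhouette, inertia, and Davies-Bouldin on well-separated clusters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, davies_bouldin_score

# Two tight groups of three points each
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(round(silhouette_score(X, km.labels_), 2))      # near 1: well separated
print(round(km.inertia_, 2))                          # low: tight clusters
print(round(davies_bouldin_score(X, km.labels_), 2))  # near 0: well separated
```

Running the same code with K=3 or K=4 would show inertia continuing to fall while the silhouette and Davies-Bouldin scores worsen, which is the trade-off the elbow method visualizes.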
We divide data into two parts: "training data" to teach the model, and "test data" to evaluate it. This simulates how the model will perform on new, unseen data. Typical splits are 80/20 or 70/30.
Stratified sampling ensures the train and test sets have the same proportion of each class as the original data. This is important when classes are imbalanced (e.g., 90% buyers, 10% non-buyers).
A random seed makes results reproducible. Using the same seed will always produce the same random split, so you can compare different model settings fairly. Change the seed to see how stable your results are.
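The three ideas above (the split, stratification, and the seed) come together in one call, sketched here assuming scikit-learn's `train_test_split`:

```python
# Sketch: a stratified, reproducible 80/20 train/test split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(-1, 1)
y = np.array([0] * 16 + [1] * 4)   # imbalanced: 80% class 0, 20% class 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,     # 80/20 split
    stratify=y,        # keep the class mix similar in both halves
    random_state=42,   # same seed -> same split every run
)

print(len(X_train), len(X_test))              # 16 4
print(int(y_train.sum()), int(y_test.sum()))  # minority class in both halves
```

Without `stratify=y`, an unlucky random split could put every minority-class sample in one half, making the test results misleading.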
Despite its name, logistic regression is used for classification (not regression). It predicts the probability of an outcome (like "Will Buy: Yes/No") based on input features. It's great for understanding which factors influence a decision.
The odds ratio tells you how much the odds of the outcome change when a feature increases by 1 unit. An odds ratio of 2 means the outcome is twice as likely. Less than 1 means less likely. Equal to 1 means no effect.
In logistic regression, coefficients show the direction and strength of each feature's influence. Positive = increases likelihood of the outcome. Negative = decreases likelihood. Larger absolute value = stronger effect.
The ROC (Receiver Operating Characteristic) curve shows how well the model distinguishes between classes at different threshold settings. The area under this curve (AUC) measures overall performance: 1.0 is perfect, 0.5 is random guessing.
Precision answers "Of all positive predictions, how many were correct?" Recall answers "Of all actual positives, how many did we find?" High precision = few false alarms. High recall = few missed cases.
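Both definitions above reduce to simple ratios of confusion-matrix counts, sketched here with made-up counts; the AUC line assumes scikit-learn's `roc_auc_score`.

```python
# Precision and recall from confusion-matrix counts, plus AUC on toy scores.
from sklearn.metrics import roc_auc_score

tp, fp, fn = 40, 10, 20   # true positives, false alarms, missed cases

precision = tp / (tp + fp)   # of all positive predictions, how many correct?
recall = tp / (tp + fn)      # of all actual positives, how many did we find?

print(precision, recall)  # 0.8 and ~0.667

# AUC: how well predicted scores rank positives above negatives
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc_score(y_true, y_score))  # 0.75: one pair is ranked wrongly
```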
Linear regression predicts a continuous numeric outcome (like sales or price) based on input features. It finds the best-fit line through your data points. Use it when you want to predict "how much" rather than "which category."
Multiple regression extends linear regression to use multiple input features. It helps you understand how several factors together influence an outcome, and which factors have the strongest impact.
R² measures how well the model explains the variation in your data. It ranges from 0 to 1. R²=0.80 means 80% of the variation is explained by the model. Higher is better, but very high values (>0.95) might indicate overfitting.
Adjusted R² accounts for the number of features in the model. Unlike regular R², it penalizes adding features that don't improve predictions. Use it to compare models with different numbers of features.
The p-value tests if a feature's effect is statistically significant. P < 0.05 is commonly considered significant, meaning there's less than 5% chance the effect is due to random chance. Lower p-values indicate stronger evidence.
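The penalty that adjusted R² applies follows the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of samples and k the number of features. A pure-Python sketch:

```python
# Adjusted R²: same fit quality, but more features means a bigger penalty.
def adjusted_r2(r2, n, k):
    """n = number of samples, k = number of features."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same R² of 0.80 on 100 samples, with 2 features vs. 20 features
print(round(adjusted_r2(0.80, n=100, k=2), 3))   # 0.796: small penalty
print(round(adjusted_r2(0.80, n=100, k=20), 3))  # 0.749: larger penalty
```

This is why a model that gains a tiny bit of R² by adding many features can still lose on adjusted R².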
Residuals are the differences between actual and predicted values. Good models have residuals that are randomly scattered around zero. Patterns in residuals may indicate the model is missing something important.
RMSE measures the average prediction error in the same units as your target variable. If predicting sales in dollars, RMSE of 100 means predictions are typically off by about $100. Lower is better.
MAE is the average absolute difference between predictions and actual values. Unlike RMSE, it doesn't penalize large errors as heavily. It's easier to interpret: MAE of 50 means predictions are off by 50 on average.
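Both error metrics follow directly from their definitions, sketched here in pure Python with invented numbers:

```python
# RMSE and MAE computed from their definitions.
from math import sqrt

actual = [100, 150, 200, 250]
predicted = [110, 140, 220, 240]

errors = [a - p for a, p in zip(actual, predicted)]  # -10, 10, -20, 10

mae = sum(abs(e) for e in errors) / len(errors)
rmse = sqrt(sum(e * e for e in errors) / len(errors))

print(mae)             # 12.5: off by 12.5 on average
print(round(rmse, 2))  # 13.23: the single -20 error is weighted more heavily
```

Note that RMSE exceeds MAE whenever the errors are uneven; here the one error of 20 pulls RMSE up because it is squared before averaging.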
A regression tree predicts numeric values using if/then rules, just like a classification tree but for continuous outcomes. Each leaf node contains the average value of training samples that reached that node. Great for understanding what drives higher or lower values.