Prompt used to regenerate this page:
Page: Decision Tree Classifier
Description: "Visualize how classification algorithms build decision trees"
Icon: "file-tree"
Tags: ["classification", "machine-learning"]
Status: ["validated"]
=== STRUCTURE ===
index.md front matter:
title: "Decision Tree Classifier"
description: "Visualize how classification algorithms build decision trees"
icon: "file-tree"
tags: ["classification", "machine-learning"]
status: ["validated"]
index.md body:
<section class="container visual size-800 ratio-1-1 canvas-contain">
<canvas id="classifier-canvas"></canvas>
</section>
=== WIDGET FILES ===
_controls.right.md (weight: 10, title: "Controls"):
h5 "Controls" then div.classifier-controls with 3 buttons:
- {{< button id="btn-train" label="Train" >}}
- {{< button id="btn-reset" label="Reset" >}}
- {{< button id="btn-step" label="Step" >}}
_stats.right.md (weight: 20, title: "Statistics"):
h5 "Statistics" then <dl> with 4 stat items:
- Algorithm (id="stat-algorithm", init "-")
- Nodes (id="stat-nodes", init "0")
- Depth (id="stat-depth", init "0")
- Accuracy (id="stat-accuracy", init "-")
_options.right.md (weight: 30, title: "Options"):
h5 "Options" then div.classifier-options with 3 controls:
- Algorithm <select id="algorithm">: C4.5 (Gain Ratio) [value="c45"], ID3 (Information Gain) [value="id3"], CART (Gini) [value="cart"]
- Dataset <select id="dataset">: Weather (Play Tennis) [value="weather"], Iris (Simple) [value="iris"], Mushroom [value="mushroom"], Titanic [value="titanic"]
- Max Depth <label> + <input type="range" id="max-depth" min=1 max=10 value=5> + <span id="max-depth-value">5</span>
=== JAVASCRIPT (default.js) ===
Imports:
- * as AlgoID3 from './_algorithm-id3.lib.js'
- * as AlgoC45 from './_algorithm-c45.lib.js'
- * as AlgoCART from './_algorithm-cart.lib.js'
- * as DataWeather from './_dataset-weather.lib.js'
- * as DataIris from './_dataset-iris.lib.js'
- * as DataMushroom from './_dataset-mushroom.lib.js'
- * as DataTitanic from './_dataset-titanic.lib.js'
- panic from '/_lib/panic_v3.js'
Pattern: IIFE with 'use strict'
Registries:
ALGORITHMS: { c45: AlgoC45, id3: AlgoID3, cart: AlgoCART }
DATASETS: { weather: DataWeather, iris: DataIris, mushroom: DataMushroom, titanic: DataTitanic }
Config: algorithmId='c45', datasetId='weather', maxDepth=5
State: canvas, ctx, tree, cachedColors
Constants: NODE_RADIUS=25, LEVEL_HEIGHT=80, MIN_NODE_SPACING=60
Functions:
getCachedColors(): lazy CSS cache (bg, primary, secondary, text, muted, leaf=--draw-color-tertiary)
invalidateColorCache(): null cachedColors
initCanvasSize(): canvas 800x800, invalidate colors
buildTree(): get data/attributes/targetAttribute/attributeTypes from dataset module,
call algorithm.buildTree with correct signature:
- id3: (data, attributes, targetAttr, 0, maxDepth) -- no attrTypes
- c45/cart: (data, attributes, targetAttr, attrTypes, 0, maxDepth)
Compute accuracy by classifying every data row via algorithm.classify(tree, row)
and counting correct predictions, then call updateStats and draw.
updateStats(accuracy): get stats via algorithm.getStats(tree) or AlgoID3.getStats fallback,
update DOM: stat-algorithm=algorithm.name, stat-nodes, stat-depth, stat-accuracy=accuracy+'%'
layoutTree(node, depth, left, right): recursive position calculator,
x=(left+right)/2, y=50+depth*LEVEL_HEIGHT,
children split width evenly from Object.keys(node.children),
each child gets edgeLabel property set to its key string
draw(): clear bg, layoutTree(tree, 0, 0, canvas.width), drawEdges(layout), drawNodes(layout)
drawEdges(layout, colors): stroke line from (parent.x, parent.y+NODE_RADIUS) to (child.x, child.y-NODE_RADIUS),
edge label at midpoint: bg-filled rect behind 11px sans-serif text, recursive for children
drawNodes(layout, colors): decision nodes = filled circle (NODE_RADIUS, primary color) with bold 11px attribute label;
leaf nodes = rounded rect (60x40, radius 8, leaf color) with bold 12px class label at y-5 and 10px "n=count" at y+10;
recursive for all children
reset(): tree=null, draw empty canvas, reset all stat elements to defaults ('-', '0')
init(): get canvas #classifier-canvas, ctx, initCanvasSize,
bind btn-train -> buildTree, btn-reset -> reset, btn-step -> buildTree,
bind select#algorithm change -> update algorithmId/algorithm, rebuild if tree,
bind select#dataset change -> update datasetId/dataset, reset,
bind input#max-depth input -> update maxDepth + display span,
listen prefers-color-scheme change -> invalidateColorCache + draw,
initial draw()
Auto-init: DOMContentLoaded or immediate
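The layoutTree recursion specified above can be sketched as follows. This is an illustrative sketch under the spec's description, not the page's exact code; the node shape ({type, attribute, children}) is assumed from the algorithm sections below.

```javascript
// Sketch of the layoutTree recursion (assumed node shape, not the page's
// exact code). Each node is centered in its horizontal band; children split
// the band evenly, and each child records the branch value that leads to it
// as edgeLabel.

const LEVEL_HEIGHT = 80;

function layoutTree(node, depth, left, right) {
  const laid = {
    node,
    x: (left + right) / 2,
    y: 50 + depth * LEVEL_HEIGHT,
    children: [],
  };
  if (node.children) {
    const keys = Object.keys(node.children);
    const width = (right - left) / keys.length;
    keys.forEach((key, i) => {
      const child = layoutTree(
        node.children[key], depth + 1,
        left + i * width, left + (i + 1) * width);
      child.edgeLabel = key; // branch value drawn on the connecting edge
      laid.children.push(child);
    });
  }
  return laid;
}
```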
=== ALGORITHM LIBRARIES ===
_algorithm-id3.lib.js:
Pure ES6 module, no imports.
Exports: entropy, informationGain, selectBestAttribute, majorityClass,
buildTree, classify, getStats, name='ID3', description string
entropy(data, targetAttr): H = -sum(p * log2(p)) over class counts
informationGain(data, attr, targetAttr): baseEntropy - weighted sum of subset entropies
selectBestAttribute(data, attributes, targetAttr): iterate attrs, pick max gain
majorityClass(data, targetAttr): count classes, return most frequent
buildTree(data, attributes, targetAttr, depth=0, maxDepth=10):
Base cases: empty data -> leaf(null,0); pure class -> leaf(class, count, distribution);
no attrs or max depth -> leaf(majorityClass).
Select best attr, create {type:'decision', attribute, gain, count, children:{}} node,
multi-way split: for each unique value, filter data, recurse with remaining attrs
classify(tree, instance): traverse by attribute values, handle unknown values via max-count child
getStats(tree): recursive traverse counting nodes, leaves, maxDepth
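The two ID3 measures above can be sketched like this. A minimal sketch, not the module's exact code; rows are assumed to be plain objects keyed by attribute name (e.g. { outlook: 'sunny', play: 'no' }).

```javascript
// Sketch of the ID3 measures: H = -sum(p * log2(p)) over class counts,
// and gain = baseEntropy - weighted sum of subset entropies.

function entropy(data, targetAttr) {
  const counts = {};
  for (const row of data) {
    counts[row[targetAttr]] = (counts[row[targetAttr]] || 0) + 1;
  }
  let h = 0;
  for (const c of Object.values(counts)) {
    const p = c / data.length;
    h -= p * Math.log2(p);
  }
  return h;
}

function informationGain(data, attr, targetAttr) {
  const base = entropy(data, targetAttr);
  // Partition rows by the attribute's value (multi-way, as in ID3).
  const subsets = {};
  for (const row of data) {
    (subsets[row[attr]] = subsets[row[attr]] || []).push(row);
  }
  let weighted = 0;
  for (const subset of Object.values(subsets)) {
    weighted += (subset.length / data.length) * entropy(subset, targetAttr);
  }
  return base - weighted;
}
```

selectBestAttribute then reduces to picking the attribute with the maximum informationGain.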
_algorithm-c45.lib.js:
Imports entropy, majorityClass from _algorithm-id3.lib.js.
Exports: splitInfo, informationGain, gainRatio, findBestThreshold,
selectBestAttribute, buildTree, classify, getStats, name='C4.5', description
splitInfo(data, attr): entropy-like calculation over attribute value distribution
gainRatio(data, attr, targetAttr): informationGain / splitInfo (0 if split=0)
findBestThreshold(data, attr, targetAttr): sort by continuous attr, try splits between
class boundaries, compute binary gain ratio, return {threshold, gainRatio}
selectBestAttribute: compute average gain across all attrs, only consider attrs with
gain >= average; for continuous: call findBestThreshold; return {attribute, gainRatio, threshold}
buildTree(data, attributes, targetAttr, attrTypes={}, depth=0, maxDepth=10):
continuous attrs -> binary split with children keys "<= X.XX" / "> X.XX" (formatThreshold: toFixed(2));
categorical attrs -> multi-way split, remove used attr from remaining
Node stores: type='decision', attribute, gainRatio, threshold, count, children
classify(tree, instance): if threshold not null, find child key starting with "<=" or ">"
based on value comparison; else traverse by categorical value
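The gain-ratio correction above can be sketched like this (simplified signatures for illustration; the module's gainRatio takes (data, attr, targetAttr) and computes the gain itself). splitInfo penalizes attributes that fragment the data into many small subsets, which raw information gain favors.

```javascript
// Sketch of the C4.5 correction: splitInfo is an entropy-like measure over
// the attribute's own value distribution, and gainRatio = gain / splitInfo
// (0 when splitInfo is 0, i.e. the attribute has a single value).

function splitInfo(data, attr) {
  const counts = {};
  for (const row of data) {
    counts[row[attr]] = (counts[row[attr]] || 0) + 1;
  }
  let si = 0;
  for (const c of Object.values(counts)) {
    const p = c / data.length;
    si -= p * Math.log2(p);
  }
  return si;
}

function gainRatio(gain, split) {
  return split === 0 ? 0 : gain / split;
}
```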
_algorithm-cart.lib.js:
Pure ES6 module, no imports.
Exports: gini, giniGain, findBestCategoricalSplit, findBestContinuousSplit,
selectBestSplit, majorityClass, buildTree, classify, getStats, name='CART', description
gini(data, targetAttr): 1 - sum(p^2)
giniGain(data, left, right, targetAttr): baseGini - weighted child gini
findBestCategoricalSplit: for each value, binary split "= val" vs "!= val", pick max gain
findBestContinuousSplit: sort by attr, try all thresholds, return {threshold, gain}
selectBestSplit: iterate attrs, pick best overall split {attribute, gain, split}
buildTree(data, attributes, targetAttr, attrTypes={}, depth=0, maxDepth=10, minSamples=2):
Always binary: continuous -> "<= X.XX" / "> X.XX"; categorical -> "= val" / "!= val"
Leaf nodes store gini value; decision nodes store threshold or splitValue
Stop if gain <= 0 or data.length < minSamples
classify(tree, instance): check tree.threshold (continuous) or tree.splitValue (categorical)
to pick left (keys[0]) or right (keys[1]) child
formatValue: toFixed(2) for numbers, String() for others
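The CART impurity measures above can be sketched like this. A minimal sketch, not the module's exact code; rows are assumed to be plain objects as in the other algorithm sketches.

```javascript
// Sketch of the CART measures: gini = 1 - sum(p^2) over class proportions,
// and giniGain = baseGini - weighted gini of the two binary children.

function gini(data, targetAttr) {
  if (data.length === 0) return 0;
  const counts = {};
  for (const row of data) {
    counts[row[targetAttr]] = (counts[row[targetAttr]] || 0) + 1;
  }
  let sumSq = 0;
  for (const c of Object.values(counts)) {
    const p = c / data.length;
    sumSq += p * p;
  }
  return 1 - sumSq;
}

function giniGain(data, left, right, targetAttr) {
  const n = data.length;
  return gini(data, targetAttr)
    - (left.length / n) * gini(left, targetAttr)
    - (right.length / n) * gini(right, targetAttr);
}
```

selectBestSplit tries every candidate binary split and keeps the one with the highest giniGain.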
=== DATASET LIBRARIES ===
_dataset-weather.lib.js:
name='Weather (Play Tennis)', targetAttribute='play'
attributes: ['outlook', 'temperature', 'humidity', 'windy'], all categorical
attributeValues: outlook=[sunny,overcast,rain], temperature=[hot,mild,cool],
humidity=[high,normal], windy=[true,false], play=[yes,no]
14 rows of classic play tennis data
_dataset-iris.lib.js:
name='Iris (Simplified)', targetAttribute='species'
attributes: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], all categorical
Values: short/medium/long for length, narrow/medium/wide for width
24 rows, 8 per class (setosa, versicolor, virginica)
_dataset-mushroom.lib.js:
name='Mushroom', targetAttribute='class'
attributes: ['cap_shape', 'cap_color', 'odor', 'gill_size', 'stalk_shape'], all categorical
20 rows (10 edible, 10 poisonous), key discriminator is odor (foul/spicy = poisonous)
_dataset-titanic.lib.js:
name='Titanic', targetAttribute='survived'
attributes: ['pclass', 'sex', 'age_group', 'embarked'], all categorical
24 rows (12 yes, 12 no), pattern: women/children + higher class = survived
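The four dataset modules above share a common contract consumed by buildTree. A minimal sketch of that shape, shown as a plain object for illustration (the real files use named ES-module exports, and the two example rows come from the classic play-tennis table):

```javascript
// Sketch of the dataset-module contract (plain object for illustration;
// the actual files export these as named ES-module bindings).
const weatherDataset = {
  name: 'Weather (Play Tennis)',
  targetAttribute: 'play',
  attributes: ['outlook', 'temperature', 'humidity', 'windy'],
  attributeTypes: {}, // all four attributes are categorical
  data: [
    { outlook: 'sunny', temperature: 'hot', humidity: 'high', windy: 'false', play: 'no' },
    { outlook: 'overcast', temperature: 'hot', humidity: 'high', windy: 'false', play: 'yes' },
    // ... 12 more rows of the classic play-tennis table
  ],
};
```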
=== SCSS (default.scss) ===
.classifier-controls: flex, gap var(--layout-spacing, 1rem), flex-wrap,
.button: flex:1, min-width 80px
.classifier-options: flex column, gap 0.5rem
label: font-weight 600, 0.875rem, margin-top 0.5rem (first-child margin-top 0)
select + input[type="range"]: width 100%
input[type="range"]: cursor pointer
dl: display grid, grid-template-columns 1fr 1fr, gap 0.25rem 0.5rem, margin 0
dt: font-weight 600, color var(--text-color-muted)
dd: margin 0, text-align right, font-family monospace
Page entirely generated and maintained by AI, with no human intervention.