Prompt used to regenerate this page:
Page: Decision Tree Classifier
Description: "Visualize how classification algorithms build decision trees"
Icon: "file-tree"
Tags: ["classification", "machine-learning"]
Status: ["validated"]
=== STRUCTURE ===
index.md front matter:
title: "Decision Tree Classifier"
description: "Visualize how classification algorithms build decision trees"
icon: "file-tree"
tags: ["classification", "machine-learning"]
status: ["validated"]
index.md body:
<section class="container visual size-800 ratio-1-1 canvas-contain">
<canvas id="classifier-canvas"></canvas>
</section>
=== WIDGET FILES ===
_controls.right.md (weight: 10, title: "Controls"):
h5 "Controls" then div.classifier-controls with 3 buttons:
- {{< button id="btn-train" label="Train" >}}
- {{< button id="btn-reset" label="Reset" >}}
- {{< button id="btn-step" label="Step" >}}
_stats.right.md (weight: 20, title: "Statistics"):
h5 "Statistics" then <dl> with 4 stat items:
- Algorithm (id="stat-algorithm", init "-")
- Nodes (id="stat-nodes", init "0")
- Depth (id="stat-depth", init "0")
- Accuracy (id="stat-accuracy", init "-")
_options.right.md (weight: 30, title: "Options"):
h5 "Options" then div.classifier-options with 3 controls:
- Algorithm <select id="algorithm">: C4.5 (Gain Ratio) [value="c45"], ID3 (Information Gain) [value="id3"], CART (Gini) [value="cart"]
- Dataset <select id="dataset">: Weather (Play Tennis) [value="weather"], Iris (Simple) [value="iris"], Mushroom [value="mushroom"], Titanic [value="titanic"]
- Max Depth <label> + <input type="range" id="max-depth" min=1 max=10 value=5> + <span id="max-depth-value">5</span>
=== JAVASCRIPT (default.js) ===
Imports:
- * as AlgoID3 from './_algorithm-id3.lib.js'
- * as AlgoC45 from './_algorithm-c45.lib.js'
- * as AlgoCART from './_algorithm-cart.lib.js'
- * as DataWeather from './_dataset-weather.lib.js'
- * as DataIris from './_dataset-iris.lib.js'
- * as DataMushroom from './_dataset-mushroom.lib.js'
- * as DataTitanic from './_dataset-titanic.lib.js'
- panic from '/_lib/panic_v3.js'
Pattern: IIFE with 'use strict'
Registries:
ALGORITHMS: { c45: AlgoC45, id3: AlgoID3, cart: AlgoCART }
DATASETS: { weather: DataWeather, iris: DataIris, mushroom: DataMushroom, titanic: DataTitanic }
Config: algorithmId='c45', datasetId='weather', maxDepth=5
State: canvas, ctx, tree, cachedColors
Constants: NODE_RADIUS=25, LEVEL_HEIGHT=80, MIN_NODE_SPACING=60
Functions:
getCachedColors(): lazy CSS cache (bg, primary, secondary, text, muted, leaf=--draw-color-tertiary)
invalidateColorCache(): null cachedColors
initCanvasSize(): canvas 800x800, invalidate colors
buildTree(): get data/attributes/targetAttribute/attributeTypes from dataset module,
call algorithm.buildTree with correct signature:
- id3: (data, attributes, targetAttr, 0, maxDepth) -- no attrTypes
- c45/cart: (data, attributes, targetAttr, attrTypes, 0, maxDepth)
Compute accuracy by classifying every data row via algorithm.classify(tree, row)
and counting correct predictions, then call updateStats and draw.
updateStats(accuracy): get stats via algorithm.getStats(tree) or AlgoID3.getStats fallback,
update DOM: stat-algorithm=algorithm.name, stat-nodes, stat-depth, stat-accuracy=accuracy+'%'
layoutTree(node, depth, left, right): recursive position calculator,
x=(left+right)/2, y=50+depth*LEVEL_HEIGHT,
children split width evenly from Object.keys(node.children),
each child gets edgeLabel property set to its key string
draw(): clear bg, layoutTree(tree, 0, 0, canvas.width), drawEdges(layout), drawNodes(layout)
drawEdges(layout, colors): stroke line from (parent.x, parent.y+NODE_RADIUS) to (child.x, child.y-NODE_RADIUS),
edge label at midpoint: bg-filled rect behind 11px sans-serif text, recursive for children
drawNodes(layout, colors): decision nodes = filled circle (NODE_RADIUS, primary color) with bold 11px attribute label;
leaf nodes = rounded rect (60x40, radius 8, leaf color) with bold 12px class label at y-5 and 10px "n=count" at y+10;
recursive for all children
reset(): tree=null, draw empty canvas, reset all stat elements to defaults ('-', '0')
init(): get canvas #classifier-canvas, ctx, initCanvasSize,
bind btn-train -> buildTree, btn-reset -> reset, btn-step -> buildTree,
bind select#algorithm change -> update algorithmId/algorithm, rebuild if tree,
bind select#dataset change -> update datasetId/dataset, reset,
bind input#max-depth input -> update maxDepth + display span,
listen prefers-color-scheme change -> invalidateColorCache + draw,
initial draw()
Auto-init: DOMContentLoaded or immediate
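The layoutTree recursion specified above can be sketched as follows. This is an illustrative sketch under the spec's description, not the page's exact code; the node shape ({type, attribute, children}) is assumed from the algorithm sections below.

```javascript
// Sketch of the layoutTree recursion (assumed node shape, not the page's
// exact code). Each node is centered in its horizontal band; children split
// the band evenly, and each child records the branch value that leads to it
// as edgeLabel.

const LEVEL_HEIGHT = 80;

function layoutTree(node, depth, left, right) {
  const laid = {
    node,
    x: (left + right) / 2,
    y: 50 + depth * LEVEL_HEIGHT,
    children: [],
  };
  if (node.children) {
    const keys = Object.keys(node.children);
    const width = (right - left) / keys.length;
    keys.forEach((key, i) => {
      const child = layoutTree(
        node.children[key], depth + 1,
        left + i * width, left + (i + 1) * width);
      child.edgeLabel = key; // branch value drawn on the connecting edge
      laid.children.push(child);
    });
  }
  return laid;
}
```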
=== ALGORITHM LIBRARIES ===
_algorithm-id3.lib.js:
Pure ES6 module, no imports.
Exports: entropy, informationGain, selectBestAttribute, majorityClass,
buildTree, classify, getStats, name='ID3', description string
entropy(data, targetAttr): H = -sum(p * log2(p)) over class counts
informationGain(data, attr, targetAttr): baseEntropy - weighted sum of subset entropies
selectBestAttribute(data, attributes, targetAttr): iterate attrs, pick max gain
majorityClass(data, targetAttr): count classes, return most frequent
buildTree(data, attributes, targetAttr, depth=0, maxDepth=10):
Base cases: empty data -> leaf(null,0); pure class -> leaf(class, count, distribution);
no attrs or max depth -> leaf(majorityClass).
Select best attr, create {type:'decision', attribute, gain, count, children:{}} node,
multi-way split: for each unique value, filter data, recurse with remaining attrs
classify(tree, instance): traverse by attribute values, handle unknown values via max-count child
getStats(tree): recursive traverse counting nodes, leaves, maxDepth
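The two ID3 measures above can be sketched like this. A minimal sketch, not the module's exact code; rows are assumed to be plain objects keyed by attribute name (e.g. { outlook: 'sunny', play: 'no' }).

```javascript
// Sketch of the ID3 measures: H = -sum(p * log2(p)) over class counts,
// and gain = baseEntropy - weighted sum of subset entropies.

function entropy(data, targetAttr) {
  const counts = {};
  for (const row of data) {
    counts[row[targetAttr]] = (counts[row[targetAttr]] || 0) + 1;
  }
  let h = 0;
  for (const c of Object.values(counts)) {
    const p = c / data.length;
    h -= p * Math.log2(p);
  }
  return h;
}

function informationGain(data, attr, targetAttr) {
  const base = entropy(data, targetAttr);
  // Partition rows by the attribute's value (multi-way, as in ID3).
  const subsets = {};
  for (const row of data) {
    (subsets[row[attr]] = subsets[row[attr]] || []).push(row);
  }
  let weighted = 0;
  for (const subset of Object.values(subsets)) {
    weighted += (subset.length / data.length) * entropy(subset, targetAttr);
  }
  return base - weighted;
}
```

selectBestAttribute then reduces to picking the attribute with the maximum informationGain.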
_algorithm-c45.lib.js:
Imports entropy, majorityClass from _algorithm-id3.lib.js.
Exports: splitInfo, informationGain, gainRatio, findBestThreshold,
selectBestAttribute, buildTree, classify, getStats, name='C4.5', description
splitInfo(data, attr): entropy-like calculation over attribute value distribution
gainRatio(data, attr, targetAttr): informationGain / splitInfo (0 if split=0)
findBestThreshold(data, attr, targetAttr): sort by continuous attr, try splits between
class boundaries, compute binary gain ratio, return {threshold, gainRatio}
selectBestAttribute: compute average gain across all attrs, only consider attrs with
gain >= average; for continuous: call findBestThreshold; return {attribute, gainRatio, threshold}
buildTree(data, attributes, targetAttr, attrTypes={}, depth=0, maxDepth=10):
continuous attrs -> binary split with children keys "<= X.XX" / "> X.XX" (formatThreshold: toFixed(2));
categorical attrs -> multi-way split, remove used attr from remaining
Node stores: type='decision', attribute, gainRatio, threshold, count, children
classify(tree, instance): if threshold not null, find child key starting with "<=" or ">"
based on value comparison; else traverse by categorical value
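The gain-ratio correction above can be sketched like this (simplified signatures for illustration; the module's gainRatio takes (data, attr, targetAttr) and computes the gain itself). splitInfo penalizes attributes that fragment the data into many small subsets, which raw information gain favors.

```javascript
// Sketch of the C4.5 correction: splitInfo is an entropy-like measure over
// the attribute's own value distribution, and gainRatio = gain / splitInfo
// (0 when splitInfo is 0, i.e. the attribute has a single value).

function splitInfo(data, attr) {
  const counts = {};
  for (const row of data) {
    counts[row[attr]] = (counts[row[attr]] || 0) + 1;
  }
  let si = 0;
  for (const c of Object.values(counts)) {
    const p = c / data.length;
    si -= p * Math.log2(p);
  }
  return si;
}

function gainRatio(gain, split) {
  return split === 0 ? 0 : gain / split;
}
```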
_algorithm-cart.lib.js:
Pure ES6 module, no imports.
Exports: gini, giniGain, findBestCategoricalSplit, findBestContinuousSplit,
selectBestSplit, majorityClass, buildTree, classify, getStats, name='CART', description
gini(data, targetAttr): 1 - sum(p^2)
giniGain(data, left, right, targetAttr): baseGini - weighted child gini
findBestCategoricalSplit: for each value, binary split "= val" vs "!= val", pick max gain
findBestContinuousSplit: sort by attr, try all thresholds, return {threshold, gain}
selectBestSplit: iterate attrs, pick best overall split {attribute, gain, split}
buildTree(data, attributes, targetAttr, attrTypes={}, depth=0, maxDepth=10, minSamples=2):
Always binary: continuous -> "<= X.XX" / "> X.XX"; categorical -> "= val" / "!= val"
Leaf nodes store gini value; decision nodes store threshold or splitValue
Stop if gain <= 0 or data.length < minSamples
classify(tree, instance): check tree.threshold (continuous) or tree.splitValue (categorical)
to pick left (keys[0]) or right (keys[1]) child
formatValue: toFixed(2) for numbers, String() for others
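The CART impurity measures above can be sketched like this. A minimal sketch, not the module's exact code; rows are assumed to be plain objects as in the other algorithm sketches.

```javascript
// Sketch of the CART measures: gini = 1 - sum(p^2) over class proportions,
// and giniGain = baseGini - weighted gini of the two binary children.

function gini(data, targetAttr) {
  if (data.length === 0) return 0;
  const counts = {};
  for (const row of data) {
    counts[row[targetAttr]] = (counts[row[targetAttr]] || 0) + 1;
  }
  let sumSq = 0;
  for (const c of Object.values(counts)) {
    const p = c / data.length;
    sumSq += p * p;
  }
  return 1 - sumSq;
}

function giniGain(data, left, right, targetAttr) {
  const n = data.length;
  return gini(data, targetAttr)
    - (left.length / n) * gini(left, targetAttr)
    - (right.length / n) * gini(right, targetAttr);
}
```

selectBestSplit tries every candidate binary split and keeps the one with the highest giniGain.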
=== DATASET LIBRARIES ===
_dataset-weather.lib.js:
name='Weather (Play Tennis)', targetAttribute='play'
attributes: ['outlook', 'temperature', 'humidity', 'windy'], all categorical
attributeValues: outlook=[sunny,overcast,rain], temperature=[hot,mild,cool],
humidity=[high,normal], windy=[true,false], play=[yes,no]
14 rows of classic play tennis data
_dataset-iris.lib.js:
name='Iris (Simplified)', targetAttribute='species'
attributes: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], all categorical
Values: short/medium/long for length, narrow/medium/wide for width
24 rows, 8 per class (setosa, versicolor, virginica)
_dataset-mushroom.lib.js:
name='Mushroom', targetAttribute='class'
attributes: ['cap_shape', 'cap_color', 'odor', 'gill_size', 'stalk_shape'], all categorical
20 rows (10 edible, 10 poisonous), key discriminator is odor (foul/spicy = poisonous)
_dataset-titanic.lib.js:
name='Titanic', targetAttribute='survived'
attributes: ['pclass', 'sex', 'age_group', 'embarked'], all categorical
24 rows (12 yes, 12 no), pattern: women/children + higher class = survived
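The four dataset modules above share a common contract consumed by buildTree. A minimal sketch of that shape, shown as a plain object for illustration (the real files use named ES-module exports, and the two example rows come from the classic play-tennis table):

```javascript
// Sketch of the dataset-module contract (plain object for illustration;
// the actual files export these as named ES-module bindings).
const weatherDataset = {
  name: 'Weather (Play Tennis)',
  targetAttribute: 'play',
  attributes: ['outlook', 'temperature', 'humidity', 'windy'],
  attributeTypes: {}, // all four attributes are categorical
  data: [
    { outlook: 'sunny', temperature: 'hot', humidity: 'high', windy: 'false', play: 'no' },
    { outlook: 'overcast', temperature: 'hot', humidity: 'high', windy: 'false', play: 'yes' },
    // ... 12 more rows of the classic play-tennis table
  ],
};
```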
=== SCSS (default.scss) ===
.classifier-controls: flex, gap var(--layout-spacing, 1rem), flex-wrap,
.button: flex:1, min-width 80px
.classifier-options: flex column, gap 0.5rem
label: font-weight 600, 0.875rem, margin-top 0.5rem (first-child margin-top 0)
select + input[type="range"]: width 100%
input[type="range"]: cursor pointer
dl: display grid, grid-template-columns 1fr 1fr, gap 0.25rem 0.5rem, margin 0
dt: font-weight 600, color var(--text-color-muted)
dd: margin 0, text-align right, font-family monospace
Page entirely generated and maintained by AI, with no human intervention.