LSTM Interactive Playground

Sequence Unrolling

See how a single RNN cell is reused across every timestep and how memory fades.

Key Concepts

Weight Sharing: The same weights (W, b) are reused at every timestep.
Hidden State (h_t): Flows from left to right, carrying context.
Update Rule: h_t = tanh(W·[h_{t-1}, p_t] + b)
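
The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the playground's own code: the dimensions, the random initialization, and the name rnn_unroll are assumptions chosen for the example. The point is that one (W, b) pair is applied at every step of the loop.

```python
import numpy as np

def rnn_unroll(inputs, W, b, h0):
    """Apply h_t = tanh(W·[h_{t-1}, p_t] + b) with one shared (W, b)."""
    h = h0
    states = []
    for p_t in inputs:
        concat = np.concatenate([h, p_t])   # [h_{t-1}, p_t]
        h = np.tanh(W @ concat + b)         # same W and b at every timestep
        states.append(h)
    return np.stack(states)

# Illustrative sizes only (assumed, not taken from the playground).
rng = np.random.default_rng(0)
hidden, embed, steps = 4, 3, 5
W = rng.normal(scale=0.5, size=(hidden, hidden + embed))
b = np.zeros(hidden)
inputs = rng.normal(size=(steps, embed))

states = rnn_unroll(inputs, W, b, np.zeros(hidden))
print(states.shape)  # (5, 4): one hidden vector per timestep
```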

How to Use

Input: Type a sentence to see it unrolled.
Sliders: Adjust the RNN decay rate and the LSTM forget bias to compare memory retention.

What to Observe

RNN memory fades quickly (darker cells)
LSTM maintains stronger gradients (brighter cells)
Higher forget bias = better long-term memory

Experiment Input:

RNN Alpha (Decay Rate): 0.85

LSTM Forget Bias: 0.98
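
One way to read the two slider values is as per-step retention factors. The sketch below assumes a simple geometric model (retention after k steps equals factor ** k); the playground may compute its retention chart differently, and the function name retention is hypothetical.

```python
def retention(factor, steps):
    """Fraction of the original signal left after 0..steps-1 timesteps,
    assuming it is multiplied by `factor` once per step."""
    return [factor ** k for k in range(steps)]

rnn = retention(0.85, 20)    # RNN alpha (decay rate) from the slider
lstm = retention(0.98, 20)   # LSTM forget bias from the slider

for k in (0, 5, 10, 19):
    print(f"step {k:2d}  RNN {rnn[k]:.3f}  LSTM {lstm[k]:.3f}")
```

Under this assumption, 19 steps after the first word the RNN factor has dropped below 5% while the LSTM factor is still about 68%.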

Analytics: Retention Strength

RNN Gradient Flow

LSTM Gradient Flow

Inside the LSTM Cell

A microscopic view of the internal gates that make LSTMs powerful.

Key Concepts

Forget Gate: f_t = σ(W_f·[h_{t-1}, p_t] + b_f)
Input Gate: i_t = σ(W_i·[h_{t-1}, p_t] + b_i)
Candidate Cell: C̃_t = tanh(W_c·[h_{t-1}, p_t] + b_c)
Cell Update: C_t = f_t⊙C_{t-1} + i_t⊙C̃_t
Output Gate: o_t = σ(W_o·[h_{t-1}, p_t] + b_o)
Hidden State: h_t = o_t⊙tanh(C_t)
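
The six equations above map directly onto one function. The following NumPy sketch is an illustration under assumed sizes and random weights; the params dictionary and the name lstm_step are hypothetical and not part of the playground.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(p_t, h_prev, c_prev, params):
    """One LSTM step following the gate equations listed above."""
    z = np.concatenate([h_prev, p_t])                      # [h_{t-1}, p_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])   # candidate cell
    c_t = f_t * c_prev + i_t * c_tilde                     # cell update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)                               # hidden state
    return h_t, c_t

# Illustrative sizes and random weights (assumptions for the example).
rng = np.random.default_rng(0)
hidden, embed = 4, 3
params = {}
for gate in "fico":
    params[f"W_{gate}"] = rng.normal(scale=0.5, size=(hidden, hidden + embed))
    params[f"b_{gate}"] = np.zeros(hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for p_t in rng.normal(size=(5, embed)):
    h, c = lstm_step(p_t, h, c, params)
print(h.shape, c.shape)  # (4,) (4,)
```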

How to Use

Sliders: Manually adjust gate activations to see how they affect the output.
Inject: Add positive/negative words to see how the cell reacts.

Forget Gate (σ): 70%

Decides how much of C_{t-1} to keep.

Input Gate (σ · tanh): 50%

Decides what new info to store in C_t.

Output Gate (σ): 80%

Filters the output h_t from C_t.
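
To see what the three sliders do together, the cell update can be reduced to scalars with the gate activations fixed by hand. This sketch assumes the default slider values shown above (70%, 50%, 80%) and treats an injected word as a candidate value in [-1, 1]; the function name manual_cell and the example inputs are hypothetical.

```python
import math

def manual_cell(c_prev, candidate, forget=0.70, input_gate=0.50, output=0.80):
    """Scalar LSTM update with hand-set gate activations."""
    c_t = forget * c_prev + input_gate * candidate   # keep part of old memory, add new info
    h_t = output * math.tanh(c_t)                    # filter what the cell exposes
    return c_t, h_t

# A positive memory (C_{t-1} = 1.0) meets a strongly negative candidate (-1.0).
c_t, h_t = manual_cell(c_prev=1.0, candidate=-1.0)
print(f"C_t = {c_t:.3f}, h_t = {h_t:.3f}")
```

With these values the cell keeps 0.7 of the old memory and adds 0.5 of the negative candidate, so C_t stays slightly positive at about 0.2.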

Word-to-Vector Pipeline


Sentiment Timeline

Visualize how RNN and LSTM accumulate sentiment word-by-word through a sentence.

Key Concepts

Accumulation: Each word shifts the running sentiment score.
Negation: Words like "not" flip the polarity of the next word.
Stability: LSTM smooths out fluctuation; RNN oscillates sharply.
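
The accumulation and negation rules can be illustrated with a toy scorer. This is not how an RNN or LSTM computes sentiment; it is a hand-written stand-in, and the lexicon, the negator set, and the function name running_sentiment are all assumptions made for the example.

```python
# Hypothetical word polarities, chosen only for illustration.
LEXICON = {"good": 1.0, "great": 1.5, "bad": -1.0, "boring": -1.0}
NEGATORS = {"not", "never"}

def running_sentiment(sentence):
    """Return the running score after each word; a negator flips the next word."""
    score, flip, trace = 0.0, False, []
    for word in sentence.lower().split():
        if word in NEGATORS:
            flip = True              # remember to flip the following word
            trace.append(score)
            continue
        polarity = LEXICON.get(word, 0.0)
        if flip:
            polarity = -polarity
            flip = False
        score += polarity
        trace.append(score)
    return trace

print(running_sentiment("the movie was good"))      # ends at 1.0
print(running_sentiment("the movie was not good"))  # ends at -1.0
```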

How to Use

Edit: Type any sentence to see live sentiment curves.
Toggle: Switch between RNN and LSTM models.
Insert "not": Add negation to see polarity flip.

Sentence Input:

Final Prediction

Negative 0.00 Positive

Word Influence Heatmap

Click any word to see its contribution

📝 Key Insight: Word order matters in sentiment analysis. The word "not" before "good" flips the sentiment entirely. LSTMs handle this through their gating mechanism, while RNNs struggle to maintain context over longer distances.

Training Dynamics

Compare how RNN and LSTM converge over 150 training epochs.

Key Concepts

Overfitting: Training accuracy rises while validation plateaus.
Vanishing Gradient: RNN gradient bars shrink over time.
Convergence: LSTM achieves smoother, higher validation accuracy.
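
The vanishing gradient can be reproduced on a scalar RNN. The sketch below assumes the simplified recurrence h_t = tanh(w·h_{t-1} + p_t) with a single weight w and applies the chain rule step by step; the function name bptt_gradient and the input values are hypothetical.

```python
import math

def bptt_gradient(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w*h_{t-1} + p_t)."""
    h, grad = h0, 1.0
    for p_t in inputs:
        h = math.tanh(w * h + p_t)
        grad *= w * (1.0 - h * h)   # derivative of one step with respect to h_{t-1}
    return abs(grad)

# Longer sequences multiply in more factors that are each below 1.
for steps in (5, 20, 50):
    print(steps, bptt_gradient(0.9, [0.1] * steps))
```

Because every factor is at most |w| = 0.9 here, the gradient reaching the first timestep is bounded by 0.9 ** T and shrinks as the sequence grows.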

Experiment Parameters

Epochs: 150
Batch Size: 128
Optimizer: Adam
LR RNN: 3e-5 | LR LSTM: 1e-5

Hyperparameters

Learning Rate: 3e-5

Batch Size: 128

Sequence Length: 300

Experiment Setup

Vocab Size: 10,000
Embed Dim: 256
RNN Units: 256
LSTM Units: 64
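
The recurrent-layer sizes implied by this setup can be worked out directly. The sketch assumes the standard formulations (one weight matrix and one bias vector for the RNN, four of each for the LSTM) and counts only the recurrent layer, not the embedding or output layers; the function names are hypothetical.

```python
def rnn_params(units, input_dim):
    """Weights on [h_{t-1}, p_t] plus one bias vector."""
    return units * (units + input_dim) + units

def lstm_params(units, input_dim):
    """Four gate-sized blocks: forget, input, candidate, output."""
    return 4 * rnn_params(units, input_dim)

embed_dim = 256
print(rnn_params(256, embed_dim))   # 131328
print(lstm_params(64, embed_dim))   # 82176
print(10_000 * embed_dim)           # 2560000 embedding weights, shared by both setups
```

Under these assumptions the 64-unit LSTM layer has fewer trainable parameters than the 256-unit RNN layer.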

RNN

Gradient Magnitude

Confusion Matrix

LSTM

Gradient Magnitude

Confusion Matrix

📝 Key Insight: The RNN's gradient bars shrink over epochs (vanishing gradient), causing its validation accuracy to plateau around 79%. The LSTM maintains stable gradients through its cell state highway, converging smoothly to ~87% validation accuracy.

Feature Attribution

Explore which words influence the model's prediction the most.

Key Concepts

Saliency: Measures each word's gradient-based importance.
Local vs Global: RNN attribution concentrates on recent words; LSTM spreads influence across the sentence.
Cell State: LSTM's internal memory trace over time.
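
Gradient-based saliency and the threshold slider can be sketched without an autodiff library by estimating gradients with central differences. Everything here is an assumption for illustration: the toy recurrent scorer, the random embeddings, and the names score, saliency, and apply_threshold are hypothetical and stand in for the playground's trained models.

```python
import numpy as np

def score(embeddings, w_rec=0.8):
    """Toy recurrent scorer: h_t = tanh(w_rec*h_{t-1} + sum(p_t)); returns final h."""
    h = 0.0
    for p_t in embeddings:
        h = np.tanh(w_rec * h + p_t.sum())
    return h

def saliency(embeddings, eps=1e-4):
    """Central-difference estimate of ||d score / d p_t|| for each word."""
    sal = np.zeros(embeddings.shape[0])
    for t in range(embeddings.shape[0]):
        grad = np.zeros(embeddings.shape[1])
        for j in range(embeddings.shape[1]):
            up, down = embeddings.copy(), embeddings.copy()
            up[t, j] += eps
            down[t, j] -= eps
            grad[j] = (score(up) - score(down)) / (2 * eps)
        sal[t] = np.linalg.norm(grad)
    return sal

def apply_threshold(words, sal, threshold=0.10):
    """Keep words whose saliency, normalized by the maximum, meets the threshold."""
    normalized = sal / sal.max()
    return [w for w, s in zip(words, normalized) if s >= threshold]

rng = np.random.default_rng(0)
words = ["the", "plot", "was", "great"]
embeddings = rng.normal(scale=0.3, size=(len(words), 2))
sal = saliency(embeddings)
print(dict(zip(words, np.round(sal / sal.max(), 2))))
print(apply_threshold(words, sal, threshold=0.10))
```

In this toy recurrence each step back in time multiplies the gradient by a factor of at most 0.8, so the most recent word always receives the highest saliency, which mirrors the local focus described for the RNN.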

How to Use

Toggle: Switch between RNN and LSTM attribution.
Threshold: Slide to hide low-importance words.
Compare: Use side-by-side mode for two sentences.

Sentence A:

Saliency Threshold: 0.10

Word Attribution — Sentence A

LSTM Cell State Magnitude

Compare Mode — Sentence B

Sentence B:

LSTM Cell State Magnitude — Sentence B

📝 Key Insight: RNNs assign strong influence to recent sentiment words but weak influence to early context. LSTMs distribute influence more evenly, allowing context words to shape the cell state over time. Compare sentences A and B to see how word order changes attribution.

Model Comparison

A dashboard summarizing the full experiment results.

Key Concepts

Generalization: Test accuracy reveals true model capability.
ROC/AUC: Measures classification quality across thresholds.
F1 Score: Harmonic mean of precision and recall.
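
The two metrics named above can be computed from first principles. The sketch uses made-up counts and scores, not the experiment's results; the function names are hypothetical, and the AUC is computed as the fraction of positive/negative pairs that the model ranks correctly, with ties counted as half.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1) from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def roc_auc(scores_pos, scores_neg):
    """Probability that a random positive scores above a random negative."""
    wins = sum((sp > sn) + 0.5 * (sp == sn)
               for sp in scores_pos for sn in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical counts and scores, for illustration only.
p, r, f1 = precision_recall_f1(tp=600, fp=150, fn=150)
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f} auc={auc:.3f}")
```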

Dataset

IMDB Movie Reviews (Binary Sentiment)
Train: 7,000 | Val: 1,500 | Test: 1,500
Balanced: 750 positive / 750 negative

📊 Accuracy Comparison

Metric	RNN	LSTM
Train Accuracy	0%	0%
Val Accuracy	0%	0%
Test Accuracy	0%	0%

📈 Accuracy Bar Chart

📉 ROC Curve

📉 Precision-Recall Curve

🎯 F1 Score Comparison

RNN

78.5%

LSTM

84.2%

🔑 Key Takeaways:

LSTM handles long-term dependencies significantly better than RNN.
RNN suffers from vanishing gradient, limiting its learning capacity.
LSTM generalizes better — higher validation and test performance.
Despite using only 64 units versus the RNN's 256, the LSTM outperforms it on every reported metric.