Virtual Labs

Long Short-Term Memory (LSTM) for Sentiment Analysis

1. Why is text preprocessing necessary before training the RNN and LSTM models?

a: To remove noise such as punctuation, digits, and extra spaces

b: To increase the number of words in each review

c: To convert labels into numeric form

d: To increase the training epochs automatically

2. What is the purpose of tokenization in the experiment?

a: To directly classify text into positive or negative classes

b: To map words into integer indices for neural network input

c: To remove stop words permanently from the dataset

d: To evaluate model accuracy

3. Why are sequences padded to a fixed maximum length?

a: To increase vocabulary size

b: To ensure all inputs have the same length for batch processing

c: To improve GPU memory usage

d: To convert labels into a binary format

4. What role does the Embedding layer play in the models?

a: It converts probabilities into class labels

b: It reduces the number of epochs required

c: It performs sentiment classification directly

d: It learns dense vector representations of words

5. Why is binary cross-entropy used as the loss function in this experiment?

a: Because the dataset has multiple sentiment classes

b: Because it reduces overfitting automatically

c: Because it is suitable for binary classification problems

d: Because it works only with RNNs

6. What does a confusion matrix help evaluate in sentiment analysis?

a: Word frequency distribution

b: Relationship between training and validation loss

c: Correct and incorrect classification counts

d: Sequence length distribution

7. Why are ROC and Precision–Recall curves plotted in this experiment?

a: To visualize training time

b: To analyse classifier performance across different thresholds

c: To compare vocabulary sizes

d: To determine batch size

8. Why does the LSTM model generally perform better than a Simple RNN in this experiment?

a: LSTM has fewer parameters

b: LSTM removes vanishing gradients completely

c: LSTM requires less training data

d: LSTM can capture long-term dependencies in text

9. What does validation accuracy indicate during training?

a: The number of words correctly classified in the vocabulary

b: Model performance on training data only

c: Model generalization performance during training

d: Vocabulary learning quality

10. What is the final objective of comparing Simple RNN and LSTM models in this experiment?

a: To demonstrate the importance of long-term dependency modelling in sentiment analysis

b: To prove that RNN is always better than LSTM

c: To reduce dataset size

d: To eliminate the need for preprocessing