Daily Python: Tweet Sentiment Analysis Using LSTM With PyTorch

Saturday, November 6, 2021

This post walks through a common case study, sentiment analysis, to explore several techniques and patterns in natural language processing (NLP).

Overview:

Imports and Data Loading
Data Preprocessing
- Null Value Removal
- Class Balance
Tokenization
Embeddings
LSTM Model Building
Setup and Training
Evaluation

Imports and Data Loading

In [81]:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


import numpy as np
import pandas as pd

import re

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

import nltk
from nltk.tokenize import word_tokenize

import matplotlib.pyplot as plt

In [4]:

nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.

Out[4]:

True

The dataset is available on GitHub in this repository: https://github.com/ajayshewale/Sentiment-Analysis-of-Text-Data-Tweets-

It is a sentiment analysis dataset consisting of two files:

train.csv, 5971 tweets
test.csv, 4000 tweets

The tweets are labeled as:

Positive
Neutral
Negative
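
A classifier needs numeric targets, so the three label strings are usually mapped to integer class indices before training. The following is a minimal sketch of that idea, not the original notebook's code: the names `LABEL_TO_ID`, `ID_TO_LABEL`, and `encode_labels`, as well as the index order, are assumptions chosen for illustration.

```python
# Hypothetical mapping: the index order is an assumption, not taken from the post.
LABEL_TO_ID = {"Negative": 0, "Neutral": 1, "Positive": 2}
# Reverse lookup, handy for turning model predictions back into label strings.
ID_TO_LABEL = {idx: label for label, idx in LABEL_TO_ID.items()}


def encode_labels(labels):
    """Convert label strings to integer class indices.

    Surrounding whitespace and letter case are normalized first, so
    "neutral " and "Neutral" map to the same class. An unknown label
    raises KeyError rather than being silently dropped.
    """
    return [LABEL_TO_ID[label.strip().capitalize()] for label in labels]


print(encode_labels(["Positive", "neutral ", "Negative"]))  # prints [2, 1, 0]
```

Failing loudly on an unknown label is a deliberate choice here: a misspelled or unexpected class is easier to fix at preprocessing time than after training.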

Other datasets may use different or additional labels, but the same preprocessing and training concepts apply. Download both files and store them locally.

In [7]:

train_path = "train.csv"
test_path = "test.csv"
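
Before going further, it can help to confirm that the downloaded files are where the paths point and that their row counts roughly match the figures quoted above. The sketch below uses only the standard library; `summarize_csv` is a hypothetical helper, and the `label_column` argument is an assumption because the CSV header names are not shown in this post.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_csv(path, label_column=None):
    """Return (row_count, label_counts) for a CSV file with a header row.

    label_column is optional; when it is None only the rows are counted
    and the returned Counter is empty.
    """
    rows = 0
    counts = Counter()
    # newline="" lets the csv module handle quoted fields that span lines.
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        for row in csv.DictReader(f):
            rows += 1
            if label_column is not None:
                counts[row[label_column]] += 1
    return rows, counts


# Only runs when the files have actually been downloaded.
for path in ("train.csv", "test.csv"):
    if Path(path).exists():
        n_rows, _ = summarize_csv(path)
        print(f"{path}: {n_rows} rows")
    else:
        print(f"{path}: not found, download it first")
```

Passing the name of the label column, once it is known from the file header, also gives a quick per-class count, which is a first look at class balance.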

Before working with PyTorch, make sure to set the

(continued...)

from Planet SciPy