... exercise in Pandas, I am trying to analyse the best strategies for Wordle. I found a list of five-letter words, and I am trying to create a DataFrame that holds each word along with the number of times each letter of the alphabet appears in it.
I have managed to do this using the following code:
import pandas as pd
import numpy as np
import string

# One word per line, so the default separator already yields a single column
# (recent pandas versions reject sep="\n").
words_df = pd.read_csv('https://raw.githubusercontent.com/charlesreid1/five-letter-words/master/sgb-words.txt', header=None)
words_df.rename(columns={0:'words'}, inplace=True)

alphabet_string = string.ascii_lowercase
alphabet_list = list(alphabet_string)

# Add one zero-filled column per letter
for letter in alphabet_list:
    words_df[letter] = 0

# Build one row of 26 letter counts per word
letter_count_list = []
for row_index, word in enumerate(words_df['words']):
    letter_count_list.append([])
    for column_index, letter in enumerate(alphabet_list):
        letter_count = word.count(letter)
        letter_count_list[row_index].append(letter_count)

# Overwrite the letter columns with the computed counts
words_df.iloc[:,1:] = letter_count_list
words_df
I get exactly what I want, which is the following dataframe:
words a b c d e f g h i j k l m n o p q r s t u v w x y z
0 which 0 0 1 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
1 there 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0
2 their 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0
3 about 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0
4 would 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
5752 osier 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0
5753 roble 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0
5754 rumba 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0
5755 biffy 0 1 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
5756 pupal 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2 0 0 0 0 1 0 0 0 0 0
The code works perfectly. However, my main goal here is to learn Pandas, and I know that a for loop is not the best way to achieve this result. I have tried using the .apply() method, but I could not get it to work. What is the "pandas way" of achieving the same result?
Extra: I know that my Python skills are not the best, so I would also appreciate comments on my code.
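To make the goal concrete, here is a minimal sketch of the kind of loop-free version I have in mind. It is only one possible approach, not necessarily the idiomatic one. It assumes a small in-memory sample in place of the downloaded word list, and the names letter_counts, apply_counts and result are made up for illustration.

```python
import string

import pandas as pd

# Small in-memory sample standing in for the full word list (an assumption,
# so the sketch runs without network access).
words_df = pd.DataFrame({"words": ["which", "there", "their", "about", "would"]})

alphabet_list = list(string.ascii_lowercase)

# Series.str.count treats its argument as a regex; a single lowercase letter
# matches itself literally, so each call counts one letter across all words.
letter_counts = pd.DataFrame(
    {letter: words_df["words"].str.count(letter) for letter in alphabet_list}
)

# Variant with .apply(): returning a Series per word expands into columns.
apply_counts = words_df["words"].apply(
    lambda word: pd.Series([word.count(letter) for letter in alphabet_list],
                           index=alphabet_list)
)

# Attach the 26 count columns next to the words column.
result = pd.concat([words_df, letter_counts], axis=1)
print(result.head())
```

Both variants still visit every letter, but the str.count version hands the per-word work to pandas for each letter, whereas the .apply() version calls a Python function once per word.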
