I'm trying to classify the sentences of a specific column into three labels with a zero-shot classification model (facebook/bart-large-mnli). The problem is that the output of the model is the sentence, the three labels, and the scores for each label.
What I need is a single column containing only the label with the highest score.
Any suggestions on how to do this? Right now my code looks like this:
from transformers import pipeline

# Work on a small random sample of the data while testing
df_test = df.sample(frac=0.0025)
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
sequence_to_classify = df_test["full_description"]
candidate_labels = ['senior', 'middle', 'junior']
# Each cell ends up holding the full result (sentence, labels, scores)
df_test["seniority_label"] = df_test.apply(lambda x: classifier(x.full_description, candidate_labels, multi_label=True), axis=1)
df_test.to_csv("Seniority_Classified_SampleTest.csv")
(I'm using a sample of the df to test the code.)
The code I've followed comes from this page, where they do get a column of labels as output, though I don't see how: https://practicaldatascience.co.uk/machine-learning/how-to-classify-customer-service-emails-with-bart-mnli
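A minimal sketch of the kind of post-processing that would give a single label per row, assuming each result is a dict with parallel `labels` and `scores` lists (which matches the output described above). The helper name `top_label` and the example dict are hypothetical and made up for illustration; the scores are not real model output.

```python
def top_label(result):
    """Return the label with the highest score from a zero-shot result dict."""
    # Assumes `result` has parallel "labels" and "scores" lists.
    scores = result["scores"]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return result["labels"][best_index]


# Hypothetical result, invented for illustration (not real model output).
example_result = {
    "sequence": "Looking for an engineer with 10+ years of experience.",
    "labels": ["senior", "middle", "junior"],
    "scores": [0.91, 0.42, 0.07],
}
print(top_label(example_result))

# With the real pipeline, the column assignment could then become:
# df_test["seniority_label"] = df_test["full_description"].apply(
#     lambda text: top_label(classifier(text, candidate_labels, multi_label=True))
# )
```

Picking the maximum explicitly avoids relying on the order in which the labels come back, at the cost of one extra pass over three scores.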
