Crash-course LLMs for Social Science
2025-09-12
Netflix recommendations
Drug Development (Source: Catacutan et al. 2024)
ChatGPT
We know stuff about some documents and want to know the same stuff about other documents.
| Term | Meaning |
|---|---|
| Classifier | A statistical model fitted to some data to make predictions about different data. |
| Training | The process of fitting the classifier to the data. |
| Train and test set | Datasets used to train and evaluate the classifier. |
| Vectorizer | A tool used to translate text into numbers. |

Statistical models can only read numbers
\(\rightarrow\) we need to translate!
| ID | Text |
|---|---|
| 1 | This is a text |
| 2 | This is no text |
| ID | This | is | a | text | no |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 0 |
| 2 | 1 | 1 | 0 | 1 | 1 |
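
The translation from the two texts to the table of numbers above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular vectorizer library works: the function name `document_term_matrix` is made up for this example, words are split on whitespace only, and columns are ordered by first appearance.

```python
def document_term_matrix(texts):
    """Return (vocabulary, matrix): one row per text, one column per word."""
    vocabulary = []
    for text in texts:
        for word in text.split():
            if word not in vocabulary:
                vocabulary.append(word)  # keep the order of first appearance
    # Count how often each vocabulary word occurs in each text
    matrix = [[text.split().count(word) for word in vocabulary] for text in texts]
    return vocabulary, matrix


texts = ["This is a text", "This is no text"]
vocabulary, matrix = document_term_matrix(texts)

print(vocabulary)
for row in matrix:
    print(row)
```

Each row is now a vector of numbers that a statistical model can read. Real vectorizers usually also lowercase the text and handle punctuation, which this sketch leaves out.
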
| review | label |
|---|---|
| great movie! | ? |
| what a bunch of cr*p | ? |
| I lost all faith in humanity after watching this | ? |
| | FALSE | TRUE |
|---|---|---|
| FALSE | 670 | 15 |
| TRUE | 34 | 281 |
| Term | Meaning |
|---|---|
| Accuracy | How much does it get right overall? |
| Recall | How many of the relevant cases does it find? |
| Precision | How many of the found cases are relevant? |
| F1 Score | Harmonic mean of precision and recall. |
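
As a worked example, the four metrics can be computed from the counts in the confusion matrix shown earlier (670, 15, 34, 281). The slides do not say which axis holds the predictions, so this sketch assumes that rows are predictions and columns are true labels. If it is the other way round, precision and recall swap, while accuracy and F1 stay the same.

```python
# Counts from the confusion matrix.
# ASSUMPTION: rows = predicted label, columns = true label.
tn, fn = 670, 15   # predicted FALSE: true negatives, false negatives
fp, tp = 34, 281   # predicted TRUE: false positives, true positives

accuracy = (tp + tn) / (tp + tn + fp + fn)           # share of all cases it gets right
precision = tp / (tp + fp)                           # share of found cases that are relevant
recall = tp / (tp + fn)                              # share of relevant cases that are found
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two

print(f"accuracy:  {accuracy:.3f}")
print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")
print(f"f1:        {f1:.3f}")
```
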

\[MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2\]
Source: IBM
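
The MSE formula translates almost word for word into code. The sketch below uses made-up toy values for \(y_i\) and \(\hat{y}_i\) purely for illustration; the function name `mean_squared_error` is chosen for this example.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical toy values, not taken from any real model
y_true = [1.0, 0.0, 1.0, 1.0]
y_pred = [0.5, 0.0, 1.0, 0.0]

mse = mean_squared_error(y_true, y_pred)
print(mse)  # (0.25 + 0 + 0 + 1) / 4 = 0.3125
```

Squaring means that one large error (the last case) weighs much more than several small ones.
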
\[\hat{y} = g(w_0 + \sum_{i=1}^{m} w_i x_i)\]
Source: IBM
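
A single unit computing \(\hat{y} = g(w_0 + \sum_i w_i x_i)\) can be sketched as follows. The sigmoid is used for \(g\) here as one common choice, which is an assumption, since the slide does not name a specific function. The weights are hypothetical and hand-picked, not trained; the input is the second row of the document-term matrix.

```python
import math


def sigmoid(z):
    """One common non-linear choice for g; squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, w0, g=sigmoid):
    """Weighted sum of the inputs plus bias w0, passed through g."""
    z = w0 + sum(w_i * x_i for w_i, x_i in zip(w, x))
    return g(z)


x = [1, 1, 0, 1, 1]               # "This is no text" as word counts
w = [0.0, 0.0, 0.5, 0.0, -0.5]    # hypothetical weights, not learned from data
w0 = 0.0                          # hypothetical bias

y_hat = neuron(x, w, w0)
print(round(y_hat, 3))
```

Training means adjusting `w` and `w0` so that outputs like `y_hat` get closer to the known labels.
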
Important: the function \(g\) is non-linear! (Why do you think that is?)
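
One way to explore the question is to stack two layers with and without a non-linear \(g\). The sketch below uses a single input and hypothetical, hand-picked weights; ReLU stands in for \(g\) as one common example. Without a non-linearity, the two layers collapse into a single linear function, so stacking adds nothing.

```python
def linear(x, w, b):
    return w * x + b


def relu(z):
    """A common non-linear choice for g: negative values become 0."""
    return max(0.0, z)


def identity(z):
    """No non-linearity at all."""
    return z


def two_layers(x, g):
    # Hypothetical weights: layer 1 is 2x - 1, layer 2 is -3h + 0.5
    return linear(g(linear(x, 2.0, -1.0)), -3.0, 0.5)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

stacked = [two_layers(x, identity) for x in xs]
collapsed = [linear(x, -6.0, 3.5) for x in xs]   # -3 * (2x - 1) + 0.5 = -6x + 3.5
with_relu = [two_layers(x, relu) for x in xs]

print(stacked == collapsed)  # two linear layers equal one linear layer
print(with_relu)             # flat, then falling: not a straight line
```
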
Who gets the best F1 score?