Add files via upload #497
Conversation
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Looks promising at a glance! cc @MoritzLaurer |
Would recommend versioning (pinning) the key libraries to avoid issues with breaking changes in the future.
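A minimal sketch of what pinning could look like in the notebook's first cell; the version numbers below are illustrative placeholders, not the versions the notebook was actually tested with:

```python
# Pin key libraries so future breaking changes don't silently break the notebook.
# NOTE: these versions are illustrative; record the versions you actually tested with.
%pip install "setfit==1.0.*" "datasets==2.18.*" "scikit-learn==1.4.*" "transformers==4.39.*"
```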
Provide a bit more context on where this figure comes from and what it represents, i.e. zero-shot results for the 3 generative LLMs and a fine-tuned RoBERTa trained on the zero-shot synthetic data from Mixtral (CoT + SC), ~1800 data rows/texts.
Use consistent terminology: pseudo labels or synthetic data (or better: explain that pseudo labels are synthetic data).
Do you mean "download" instead of "upload"?
I would also slightly reformulate to make it clear that by "skip the training step" you mean skipping the code two cells further down. (You could even add an if/else to let people choose, as in the sketch below.)
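A minimal sketch of that if/else, assuming a SetFit model; the flag name and checkpoint repo id are hypothetical placeholders:

```python
from setfit import SetFitModel

# Hypothetical flag: set True to run the training cells further down instead of loading a checkpoint.
TRAIN_FROM_SCRATCH = False

if TRAIN_FROM_SCRATCH:
    pass  # fall through to the training code two cells further down
else:
    # Load the already fine-tuned model; this repo id is a placeholder.
    model = SetFitModel.from_pretrained("your-username/setfit-fsa-checkpoint")
```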
Interesting, I didn't know the WOS approach/metric. I suppose that word order is only one thing that transformers take into account; another aspect would be the semantic (dis)similarity of different strings. With a CountVectorizer you only capture the exact words that are in the training corpus, so it can't capture semantically similar words that fall outside the training data distribution.
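A small illustration of that limitation; the example sentences are made up:

```python
from sklearn.feature_extraction.text import CountVectorizer

# The vectorizer only knows tokens seen during fit.
vec = CountVectorizer().fit(["profits rose sharply", "revenue fell slightly"])

# A semantically similar sentence with no token overlap maps to an all-zero vector.
print(vec.transform(["earnings increased strongly"]).toarray())  # [[0 0 0 0 0 0]]
```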
Line #11: "in average" should be "on average": `print('The WOS implies that on average {:0.1f}% of the sentences in the financial sentiment analysis (FSA) dataset are rather simple.\n'.format(100 - 100 * WOS))`
typo "prbabilities".
I would maybe also add a note somewhere that this is less likely to work on more complex reasoning tasks (I'd assume e.g. that countvectorizers just can't represent complex semantics / classes well enough)
Interesting! (Worth noting that you are also increasing the size of the MLP here, in addition to adding more training data; maybe make that (small) increase in size explicit, e.g. as in the sketch below.)
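A minimal sketch of making the capacity change explicit, assuming a scikit-learn MLP head; the layer sizes are illustrative placeholders, not the notebook's actual values:

```python
from sklearn.neural_network import MLPClassifier

# Baseline head trained on the smaller dataset (size is illustrative).
mlp_base = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)

# Slightly larger head used once more training data is added; state this change explicitly.
mlp_large = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
```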
Looks interesting and good to me. I would assume that this works less well for more complex tasks, and that the additional step of distilling from the SetFit model takes more developer time, but overall it's a cool approach for further compressing the model and making inference much more efficient.
Thanks @MoritzLaurer for the comments. We updated the notebook accordingly.
Great! Don't have bandwidth for a joint blog atm unfortunately. |
Hi @tomaarsen I think we are good to go and merge into main. |
Hi @tomaarsen Could you merge into main? |
@MosheWasserb My apologies for the radio silence here, I was very busy with https://github.com/UKPLab/sentence-transformers/releases/tag/v2.6.0 |
Hi @tomaarsen Sure, no problem.

```python
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# model after fine-tuning
X_train = model.encode(x_train)

# PCA down to 2D vectors (fit_transform added so the snippet actually runs)
estimator = PCA(n_components=2)
X_train_em = estimator.fit_transform(X_train)

# Logistic regression, 2nd phase
sgd = LogisticRegression()
```
Thank you! PCA remains strong indeed, especially for classification. It doesn't work very well for retrieval, however; there I've had more luck with 1. Matryoshka models and 2. quantization to speed up the comparisons.
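A minimal sketch of the embedding quantization mentioned above, assuming the `quantize_embeddings` helper shipped in sentence-transformers v2.6.0; the model name is only an example:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

# Any Sentence Transformer works here; this model name is just an example.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(["Profits rose sharply.", "Revenue fell slightly."])

# Binary quantization: much smaller embeddings, compared via fast Hamming distance.
binary_embeddings = quantize_embeddings(embeddings, precision="binary")
print(binary_embeddings.shape, binary_embeddings.dtype)
```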
Adding a new notebook that demonstrates zero-cost, zero-time, zero-shot Financial Sentiment Analysis:
from GPT4/Mixtral to MLP128K with SetFit.
@tomaarsen Could you also send it to Moritz Laurer for review?