Dealing with cold start users click history #11
Comments
Here is an update on what I did. Inside the
where
This may be useful for other people who are trying to solve the same problem.
Hi @igor17400, Thanks for raising this issue. Indeed, the original code did not work with empty user histories, but implementing this functionality should be useful for many users. I think your solution is simple and elegant. I can have a look at it over the weekend, to test it with both pretrained word embeddings and PLMs, and streamline it across the data preprocessing functions for all datasets. Would you like to open a PR with your proposed solution?
Hi @andreeaiana, I'm in the process of implementing PP-Rec, as outlined in PR #12. I'm currently working through it, ensuring that the blocks are accurate and checking the scores and behaviors for MIND large and Adressa, so it is not ready to be merged yet. However, just to let you know, in this PR I'm adding the
Great, thanks for letting me know and for your contributions to the library.
@andreeaiana I noticed you filter out cold start users (those with empty histories). Why is that? I'm wondering if it might be better to use a strategy like the one I previously mentioned (_initialize_cold_start) to pre-populate these cold start users with some placeholder news articles, rather than removing them. But maybe my thinking is wrong.
I think that's a good idea, we can try it. I know that some models originally do that, but not all of them.
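For readers following along, the pre-population idea discussed above could look roughly like the sketch below. It is not the code from the PR: the function name `initialize_cold_start`, the placeholder ID `N_COLD_START`, and the dictionary layout are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical placeholder ID; the library may use a different convention.
PLACEHOLDER_NEWS_ID = "N_COLD_START"


def initialize_cold_start(
    histories: Dict[str, List[str]],
    placeholder: str = PLACEHOLDER_NEWS_ID,
) -> Dict[str, List[str]]:
    """Return a copy in which every empty click history holds one placeholder ID."""
    return {
        user: list(clicks) if clicks else [placeholder]
        for user, clicks in histories.items()
    }


# U2 is a cold start user with no clicks.
histories = {"U1": ["N10", "N42"], "U2": []}
filled = initialize_cold_start(histories)
```

With this approach cold start users stay in the dataset instead of being filtered out. The placeholder article would still need an entry in the news table (for example an empty title) so that downstream lookups do not fail.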
Hello! I'm currently handling a dataset where the `histories` column might initially be empty, especially for users who are accessing the system for the first time.
Given this context, I'm seeking advice on how to approach a particular situation highlighted in the code found at this GitHub link. The process involves tokenizing the titles of previously clicked news articles, but I'm facing a potential cold start issue for new users without any history. In these instances, should I consider tokenizing empty titles, abstracts, etc.?
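One way to picture the "tokenize empty titles" option is a fixed-size history encoding with a mask, sketched below. The whitespace tokenizer, the toy vocabulary, the index values `PAD_ID` and `UNK_ID`, and the function `encode_history` are assumptions for illustration only, not the library's preprocessing code.

```python
from typing import Dict, List, Tuple

PAD_ID = 0  # assumed padding index
UNK_ID = 1  # assumed unknown-word index


def encode_history(
    titles: List[str],
    vocab: Dict[str, int],
    max_history: int = 3,
    max_title_len: int = 4,
) -> Tuple[List[List[int]], List[int]]:
    """Encode clicked titles into a fixed-size grid of token IDs plus a mask."""
    rows: List[List[int]] = []
    mask: List[int] = []
    for title in titles[-max_history:]:  # keep the most recent clicks
        ids = [vocab.get(w, UNK_ID) for w in title.lower().split()][:max_title_len]
        rows.append(ids + [PAD_ID] * (max_title_len - len(ids)))
        mask.append(1)
    # Pad the missing history slots; an empty history yields only padded rows.
    while len(rows) < max_history:
        rows.append([PAD_ID] * max_title_len)
        mask.append(0)
    return rows, mask


vocab = {"stocks": 2, "rally": 3, "today": 4}
tokens, mask = encode_history([], vocab)  # cold start user
warm_tokens, warm_mask = encode_history(["Stocks rally today"], vocab)
```

For a cold start user the mask is all zeros, which lets the model tell real clicks from padding. Attention layers that apply a softmax over a fully masked history may need special handling in that case.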