What open-source datasets are there that would also have a good sentence-encoder on hugging face?
Last active 5 months ago
5 replies
20 views
- GE
What open-source datasets are there that would also have a good sentence-encoder on hugging face?
- JO
Try https://huggingface.co/datasets/agnews with distilbert-base-uncased. It's my go-to combination for development and for trying things out. I usually work with a reduced set, though: agnews can be huge, so I randomly sample 3k records. You can pick that example from our sample projects in the application.
- JO
That combination is even already available in our sample projects, including the encoding.
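For anyone who wants to reproduce that combination outside the application, here is a minimal sketch, not the sample project's actual code. It assumes the dataset is published on the Hub under the id `ag_news` (the URL above spells it `agnews`), that the `datasets`, `transformers`, and `torch` packages are installed, and that mean pooling over token embeddings is an acceptable way to turn distilbert-base-uncased into a sentence encoder. The helper names `sample_indices`, `load_sample`, and `encode` are hypothetical.

```python
import random


def sample_indices(n_rows, k=3000, seed=42):
    """Pick k distinct row indices uniformly at random (all rows if n_rows <= k)."""
    if n_rows <= k:
        return list(range(n_rows))
    return sorted(random.Random(seed).sample(range(n_rows), k))


def load_sample(k=3000, seed=42):
    """Load the AG News training split and keep a random subset of k rows."""
    from datasets import load_dataset  # imported lazily; needs network access

    # Assumption: "ag_news" is the Hub id behind the URL mentioned in the thread.
    ds = load_dataset("ag_news", split="train")
    return ds.select(sample_indices(len(ds), k, seed))


def encode(texts, model_name="distilbert-base-uncased", batch_size=32):
    """Embed texts by mean-pooling the model's last hidden state (an assumption)."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    vectors = []
    with torch.no_grad():
        for start in range(0, len(texts), batch_size):
            batch = tokenizer(
                texts[start:start + batch_size],
                padding=True,
                truncation=True,
                return_tensors="pt",
            )
            hidden = model(**batch).last_hidden_state  # (batch, tokens, hidden)
            # Zero out padding tokens before averaging over the token axis.
            mask = batch["attention_mask"].unsqueeze(-1).type_as(hidden)
            vectors.append((hidden * mask).sum(dim=1) / mask.sum(dim=1))
    return torch.cat(vectors).numpy()
```

Calling `subset = load_sample()` followed by `embeddings = encode(list(subset["text"]))` should give one vector per sampled record. The fixed seed keeps the 3k subset reproducible between runs.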
- GE
@mention it was just this mini piece https://medium.com/@george.pearse/vector-databases-for-data-centric-ai-part-2-ba995053ce05
- JO
Awesome, really cool demo showcase!
And major thanks for the shout-out in the article. That caught me by surprise and put a smile on my face.