I'm a data scientist at a small company (around 30 devs and 7 data scientists, plus sales, marketing, management, etc.). Our work is mainly classic tabular data science with a bit of geolocation data: lots of statistics and some ML pipeline model training. After a short discussion about using ChatGPT and GitHub Copilot, my boss decided that in...
I've recently developed an xG (expected goals) model using event data, and I'm exploring the best methods for evaluating its accuracy. In football, goals are discrete (at the level of a single shot, the outcome is binary), but my model predicts a continuous probability in the range (0, 1). I'm curious about the most appropriate statistical...
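For the xG evaluation question above, two proper scoring rules that are commonly used to compare predicted probabilities against binary outcomes are the Brier score and log loss. The following is only a minimal sketch in plain Python: the function names and the four example shots are hypothetical illustrations, not data or code from the post.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of the observed 0/1 outcomes."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        # Clip so that a prediction of exactly 0 or 1 never hits log(0).
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


# Hypothetical example: predicted xG per shot and whether it was a goal.
xg = [0.05, 0.30, 0.76, 0.10]
goal = [0, 0, 1, 0]

print("Brier score:", brier_score(xg, goal))
print("Log loss:", log_loss(xg, goal))
```

Lower is better for both metrics. Comparing them against a naive baseline, such as predicting the overall conversion rate for every shot, gives the numbers some context.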
This groundwork enables ecosystem players to consider deploying RAG solutions in real time without having to configure data retrieval systems. Link to Louis Brulé-Naudet's Hugging Face profile

```python
import concurrent.futures
import logging

import datasets  # needed for the datasets.Dataset annotation below
from tqdm import tqdm


def dataset_loader(
    name: str,
    streaming: bool = True
) -> datasets.Dataset:...
```
A website was posted here in r/ML that compiles the best products suggested in each subreddit. For example, for earphones, the AI website lists and ranks the top models and brands based on Redditors' reviews and suggestions. I can't find the website for the life of me. submitted by /u/vertigondriac
It would be great if there were a collection of such sources, or if you could share your go-to place for keeping up to date with all the new research going on. submitted by /u/pontiac_RN
You can find the paper here: https://arxiv.org/abs/2403.18671

Here is a list of what you can find in the paper:

- We reveal that large commercial language models cannot be used for everyday fact-checking tasks.
- We argue that evaluating the fact-checking pipeline across websites does not fully demonstrate model transferability, and instead,...