[D] Real talk about RAG

Let’s be honest here. I know we all have to deal with managers/directors/CXOs who come up with the amazing idea of talking to the company's data and documents. But… has anyone actually built something truly useful? If so, how was its usefulness measured? I have a feeling that we are being fooled by some very elaborate bs, as the LLM can always generate...

Sat Apr 27, 2024 21:34
[P] BLEU scores not improving

I am working on an image captioning project using a CNN+LSTM architecture. Currently I am using GoogLeNet as the CNN, and LSTM and BiLSTM as the RNN variants. For embeddings I am using word2vec. The dataset is Flickr8k. As you can see from the BLEU scores, they are very close, i.e. there is no noticeable improvement. The parameter values are the same in both approaches. The values are...
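For readers comparing BLEU across two captioning models, it can help to see how the score is assembled. The following is a minimal sketch in plain Python of unsmoothed sentence-level BLEU (clipped n-gram precision plus brevity penalty). The function names `ngrams`, `modified_precision`, and `sentence_bleu` are illustrative assumptions, not the poster's evaluation code, and real projects would normally rely on an established implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Return (clipped n-gram matches, total candidate n-grams)."""
    cand_counts = ngrams(candidate, n)
    # For each n-gram, the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped, sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform weights over 1..max_n."""
    log_precisions = []
    for n in range(1, max_n + 1):
        matched, total = modified_precision(candidate, references, n)
        if matched == 0:
            # Without smoothing, one empty n-gram order zeroes the score.
            return 0.0
        log_precisions.append(math.log(matched / total))
    c = len(candidate)
    # Reference length closest to the candidate length (shorter wins ties).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "a dog runs across the grass".split()
caption = "a dog runs on the grass".split()
bleu2 = sentence_bleu(caption, [reference], max_n=2)
bleu4 = sentence_bleu(caption, [reference], max_n=4)
```

In this toy case the caption shares no 4-gram with the reference, so the unsmoothed BLEU-4 is zero even though BLEU-2 is not. Short captions often behave this way, which is one reason sentence-level scores from two similar models can look nearly identical.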

Sat Apr 27, 2024 21:34
[D] But what does a trained Convolution Neural Network actually learn? Visualized!

Sharing a video from my YT channel explaining convolution and visualizing how kernels are learnt… enjoy! Submitted by /u/AvvYaa

Sat Apr 27, 2024 18:34
[P] Classification finetuning experiments on small GPT-2 sized LLMs

I ran a few classification finetuning experiments on relatively "small" models that I found interesting and wanted to share:

# | Model | Weights | Trainable token | Trainable layers | Context length | CPU/GPU | Training time | Training acc | Validation acc | Test acc
1 | gpt2-small (124M) | pretrained | last | last_block | longest train ex. (120) | V100 | 0.39 min | 96.63% | 97.99%...

Sat Apr 27, 2024 15:34
[R] Transfer learning in environmental data-driven models

Brand new paper published in Environmental Modelling & Software. We investigate the possibility of training a model in a data-rich site and reusing it without retraining or tuning in a new (data-scarce) site. The concepts of transferability matrix and transferability indicators have been introduced. Check out more here: https://www.researchgate.net/publication/380113869_Transfer_learning_in_environmental_data-driven_models_A_study_of_ozone_forecast_in_the_Alpine_region...

Sat Apr 27, 2024 15:34
[D] Llama-3 based OpenBioLLM-70B & 8B: Outperforms GPT-4, Gemini, Meditron-70B, Med-PaLM-1 & Med-PaLM-2 in Medical-domain

Open source strikes again: we are thrilled to announce the release of OpenBioLLM-Llama3-70B & 8B. These models outperform industry giants like OpenAI’s GPT-4, Google’s Gemini, Meditron-70B, Google’s Med-PaLM-1, and Med-PaLM-2 in the biomedical domain, setting a new state of the art for models of their size. The most capable openly available Medical-domain...

Sat Apr 27, 2024 15:34
