Hey r/MachineLearning, I'm sharing a showcase of how we improved RAG accuracy on documents with visual elements such as tables and plots by using GPT-4o in both the parsing and answering stages. It consists of several parts: Data indexing pipeline (incremental): We extract tables as images during the parsing process. GPT-4o explains the content of...
I’m currently a machine learning engineer, but I focus much more heavily on pipelines, much as I did when I was a data engineer. I’d love to get more into the model-building side of things, but my model knowledge has gotten a bit rusty since I finished my M.S. in Statistics. What portion of your day-to-day work is focused on deploying...
Hey everyone, I am sharing a recent experience as a PhD student and seeking advice on interpreting a methodological oversight in our study. Upon reflection on a recent paper we submitted (I already submitted the camera-ready version), I realized we made a methodological error. Specifically, we used a validation set derived from the test set, potentially...
I have been struggling to understand how more basic dimensionality reduction techniques differ from more advanced methods, mainly in whether the same intuition about subspaces, manifolds, etc. extends to the more basic methods. I understand how things like PCA, t-SNE, UMAP, etc. work (and these are 90% of what comes up when looking for dimensionality...
Have you heard about LoRA? Do you know why it is important and how it could save GPUs? I am quite sure that you have, but do you know how it works? If you want to learn or refresh your knowledge, check out a new blog post about LoRA (https://nebius.ai/blog/posts/fine-tuning/lora-low-rank-adaptation). There are many variations like QLoRA, S-LoRA, etc. Which...
Hi all, I'm looking for weights (preferably a training checkpoint) for an object detection model built with DiT and fine-tuned to recognize document layouts. Something like the picture I attached: https://preview.redd.it/mculpv0l101d1.png?width=601&format=png&auto=webp&s=2db28aa8fa05c1480418f46956fce731fbba3a1f Thanks to anyone who can help...