Build your own newsfeed

700 followers, 167 articles/week

[P] Improving Table Data Extraction from Financial Documents: Multimodal RAG with GPT-4o and Pathway

Hey r/MachineLearning, I'm sharing a showcase on how we improved RAG accuracy on documents with visual elements such as tables and plots by using GPT-4o in both the parsing and answering stages. It consists of several parts: Data indexing pipeline (incremental): We extract tables as images during the parsing process. GPT-4o explains the content of...

Sat May 18, 2024 00:45

[D] Machine Learning Engineers, what portion of your work is focused on deployment pipelines vs. model building/tuning?

I’m currently a machine learning engineer, but I focus much more heavily on the pipelines, in a way that is similar to when I was a data engineer. I’d love to get more into the model building side of things, but my model knowledge has gotten a bit rusty since I finished my M.S. in Statistics. What portion of your day-to-day work is focused on deploying...

Fri May 17, 2024 21:45

[R] Seeking Advice on a Methodological Oversight: Navigating Errors and Seeking Clarity

Hey everyone, I am sharing a recent experience as a PhD student and seeking advice on interpreting a methodological oversight in our study. Upon reflection on a recent paper we submitted (I already submitted the camera-ready version), I realized we made a methodological error. Specifically, we used a validation set derived from the test set, potentially...

Fri May 17, 2024 21:45

[D] How are subspace embeddings different from basic dimensionality reduction?

I have been struggling to understand how more basic dimensionality reduction techniques differ from more advanced methods, mainly in whether the same intuition about subspaces, manifolds, etc. extends to the more basic methods. I understand how things like PCA, t-SNE, UMAP, etc. work (and these are 90% of what comes up when looking for dimensionality...

Fri May 17, 2024 18:45

[D] Fundamentals of LoRA and low-rank fine-tuning

Have you heard about LoRA? Do you know why it is important and how it could save GPUs? I am quite sure that you have, but do you know how it works? If you want to learn or refresh your knowledge, check out a new blog post about LoRA (https://nebius.ai/blog/posts/fine-tuning/lora-low-rank-adaptation). There are many variations like QLoRA, S-LoRA, etc. Which...

Fri May 17, 2024 18:45

[D] Weights for DiT fine-tuned on PubLayNet or DocLayNet

Hi all, I'm looking for weights (preferably a training checkpoint) for an object detection model built with DiT and fine-tuned to recognize document layouts. Something like the picture I attached: https://preview.redd.it/mculpv0l101d1.png?width=601&format=png&auto=webp&s=2db28aa8fa05c1480418f46956fce731fbba3a1f Thanks to anyone who can help...

Fri May 17, 2024 18:45
