[R] HGRN2: Gated Linear RNNs with State Expansion

Paper: https://arxiv.org/abs/2404.07904
Code: https://github.com/OpenNLPLab/HGRN2
Standalone code (1): https://github.com/Doraemonzzz/hgru2-pytorch
Standalone code (2): https://github.com/sustcsonglin/flash-linear-attention/tree/main/fla/models/hgrn2
Abstract: Hierarchically gated linear RNN (HGRN, Qin et al. 2023) has demonstrated competitive training...

Fri May 3, 2024 15:35
[R] A Primer on the Inner Workings of Transformer-based Language Models

Authors: Javier Ferrando (UPC), Gabriele Sarti (RUG), Arianna Bisazza (RUG), Marta Costa-jussà (Meta)
Paper: https://arxiv.org/abs/2405.00208
Abstract: The rapid progress of research aimed at interpreting the inner workings of advanced language models has highlighted a need for contextualizing the insights gained from years of work in this area....

Fri May 3, 2024 15:35
[D] Help: 1. Is my current PhD position alright? 2. 3D computer vision (point cloud processing): is my research roadmap correct?

I am currently a PhD student in the middle of my second semester (almost 7 months in), focusing on point cloud research for classification and segmentation, entirely on my own, with no guidance from my professor or fellow PhD students. I have two particular questions: should I drop out of my PhD under my current supervisor? Why? Because there is almost no...

Fri May 3, 2024 12:35
Let's talk about the difference between NLP and LLM [D]

NLP is like a team of specialists, each one good at a specific job, like translating languages or figuring out if a sentence sounds positive or negative. They're really good at what they do, but they only know how to do one thing. LLMs are like the superstars of the language world. They've been trained on tons of information, so they can do lots of...

Fri May 3, 2024 12:35
[D] Fine-tune Phi-3 model for domain-specific data - seeking advice and insights

Hi, I am currently working on fine-tuning the Phi-3 model for financial data. While the loss is decreasing during training, suggesting that the model is learning quite well, the results on a custom benchmark are surprisingly poor. In fact, the accuracy has decreased compared to the base model. Results I've observed: Phi-3-mini-4k-instruct (base model):...

Fri May 3, 2024 12:35
[R] positive draws for bioDraws

I'm a beginner in Python. Please help me with the following situation; my research is stuck. Consider the following equation, in which I have to generate random values (the method is currently set to NORMAL_MLHS): L1 = c + sigmaL1 * bioDraws('E_L1', 'NORMAL_MLHS'), where L1 is an endogenous variable and c is an estimable constant for which the lower bound is...

Fri May 3, 2024 12:35
