[R] Efficiency and Maintainability in Named Entity Recognition: A Trie-based Knowledge Base Approach

Hey r/machinelearning! I'm new here and recently wrote an article titled "Efficiency and Maintainability in Named Entity Recognition: A Trie-based Knowledge Base Approach" where I discuss a trie-based knowledge base approach for Named Entity Recognition (NER) models. I wanted to share it with you all and get your opinions and insights! Summary: In the...

Wed May 31, 2023 12:33
[D] What is meant by parameters when the size of LLMs and other NNs is discussed?

I remember articles from a few years ago reporting on huge neural networks trained by researchers at Google, where the size was measured by counting the number of parameters. Back then I thought they were referring to the weights of the NNs. Now, if you look at the Wikipedia description of ChatGPT, it says that the largest model of version 3 has 175 Billion...

Wed May 31, 2023 12:23
[R] Fine-Tuning Language Models with Just Forward Passes

This paper presents a memory-efficient zeroth-order optimizer (MeZO) for fine-tuning language models (LMs). As LMs grow larger, backpropagation becomes computationally costly, requiring large amounts of memory. MeZO adapts the classical Zeroth-order Stochastic Gradient Descent (ZO-SGD) method to operate in-place, enabling fine-tuning of LMs with the...

Wed May 31, 2023 12:04
[N] (Update: Code Released) Landmark Attention: Random-Access Infinite Context Length for Transformers

Code for Landmark Attention is now released and it should be possible to finetune existing LLaMA models using this method. https://github.com/epfml/landmark-attention Paper: https://arxiv.org/abs/2305.16300 The paper introduces a new method called Landmark Attention that addresses the memory limitations of transformers when dealing with longer contexts....

Wed May 31, 2023 12:04
Trend anomaly detection [D]

Hi, may I get a suggestion on which machine learning model or statistical approach I can use to flag, as anomalies, data points that are gradually decreasing or increasing in time series data? Thank you in advance. submitted by /u/SeaworthinessGlad975

Wed May 31, 2023 11:54
[D] (very) few data

Hello ML friends :) So I'm building a machine learning model for an expensive experiment; the problem, as you may have guessed, is a lack of data. I only have the data from 17 previous experiments, with 7 independent variables and 1 dependent variable in each one. My question to you is: how do I deal with data that is very important and yet very scarce? I think...

Wed May 31, 2023 10:24
