
GPTScore: Evaluate as You Desire

GPT; analytical AI to generative AI; large PLM + prompt $\rightarrow$ superior performance; the need to evaluate the quality of these generated texts; evaluating singl… (sketch below)

January 3, 2025 · 0 comments
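
The teaser cuts off, so here is a minimal sketch of the core idea as I read it: score a candidate text by its mean conditional log-likelihood under a generative PLM, given an evaluation prompt. The gpt2 checkpoint and the prompt string are placeholders, not the paper's setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")               # placeholder PLM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def gptscore(prompt: str, text: str) -> float:
    """Mean log p(text token | prompt, preceding text tokens)."""
    ids = tok(prompt + text, return_tensors="pt").input_ids
    n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # predict next token
    targets = ids[0, 1:]
    token_lp = logprobs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return token_lp[n_prompt - 1:].mean().item()          # only the text's tokens

print(gptscore("Evaluate fluency. Text: ", "The cat sat on the mat."))
```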

SELF-EXPERTISE: Knowledge-Based Instruction Dataset Augmentation for a Legal Expert Language Model

Instruction tuning dataset; instruction tuning is important for LLMs; automatic generation methods are unsuitable for some domains where accuracy is important…

January 3, 2025 · 0 comments

A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction

Previous legal expert systems; useful in certain areas; deep-learning-based approaches; legal judgement prediction; legal content generation; legal text classification…

January 2, 2025 · 0 comments

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

1. Introduction: PLMs learn a substantial amount of in-depth knowledge from data; they can't expand or revise their memory; can't straightforward… (toy sketch below)

January 2, 2025 · 0 comments
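
As a gloss on the truncated summary, here is a toy retrieve-then-generate loop in the spirit of RAG; the random document vectors and the generate callback are stand-ins for the paper's DPR retriever and BART generator.

```python
import numpy as np

docs = ["doc about A", "doc about B", "doc about C"]            # toy corpus
doc_vecs = np.random.default_rng(0).normal(size=(len(docs), 16))

def retrieve(query_vec: np.ndarray, k: int = 2) -> list[str]:
    scores = doc_vecs @ query_vec        # inner-product relevance, DPR-style
    return [docs[i] for i in np.argsort(-scores)[:k]]

def rag_answer(query: str, query_vec: np.ndarray, generate) -> str:
    context = "\n".join(retrieve(query_vec))   # condition generation on docs
    return generate(f"context: {context}\nquestion: {query}")

print(retrieve(np.ones(16)))
```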

Computational analysis of 140 years of US political speeches reveals more positive but increasingly polarized framing of immigration

Abstract: 200K US congressional speeches + 5K presidential communications related to immigration from 1880 to the present; political speech about immigration…

May 27, 2024 · 0 comments

Elements of World Knowledge (EWoK)

Elements of World Knowledge (EWoK): a cognition-inspired framework for evaluating basic world knowledge in LMs; LLMs acquire a substantial amount of knowledge…

May 19, 2024 · 0 comments

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

LLM acceleration: sparsity, quantization, head pruning; reducing the number of layers used per token by exiting early during inference; speculative decoding; main… (toy sketch below)

May 15, 2024 · 0 comments
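
To make the teaser concrete, here is a toy early-exit loop: run layers one at a time and stop as soon as a shared exit head is confident. The random weights, tanh block, and 0.9 threshold are illustrative assumptions, not LayerSkip's trained model or exit criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, VOCAB, N_LAYERS = 64, 100, 12
layers = [rng.normal(scale=0.05, size=(DIM, DIM)) for _ in range(N_LAYERS)]
exit_head = rng.normal(scale=0.05, size=(DIM, VOCAB))   # shared LM head

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def early_exit(h, threshold=0.9):
    """Return (predicted token, number of layers actually used)."""
    for i, W in enumerate(layers, start=1):
        h = h + np.tanh(h @ W)               # toy residual block
        probs = softmax(h @ exit_head)       # predict from this depth
        if probs.max() >= threshold:         # confident -> exit early
            return int(probs.argmax()), i
    return int(probs.argmax()), N_LAYERS     # fell through: used all layers

print(early_exit(rng.normal(size=DIM)))
```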

Octopus v4: Graph of Language Models

1. Introduction: LLMs have become very powerful and are used in many fields; thanks to Llama 2 and 3, open-source LLMs have seen significant growth; use…

May 5, 2024 · 0 comments

Can large language models explore in-context?

1. Introduction: in-context learning $\rightarrow$ an important emergent capability of LLMs; without updating model parameters, an LLM can solve various…

March 30, 2024 · 0 comments

Training Neural Networks from Scratch with Parallel Low-Rank Adapters

SOTA models' complexity $\rightarrow$ computation / memory / communication bandwidth; LoRA; quantizing model parameters; prior work has been limited to fine-tuning… (sketch below)

March 24, 2024 · 0 comments
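
For context on the teaser, here is the standard low-rank-adapter parameterization ($\Delta W = BA$) that the paper builds on; this shows plain LoRA on a toy matrix, not the paper's parallel from-scratch training scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 256, 8                        # hidden size, adapter rank (r << d)
W = rng.normal(size=(d, d))          # frozen base weight
B = np.zeros((d, r))                 # low-rank factors: delta W = B @ A
A = rng.normal(scale=0.01, size=(r, d))

def adapted_forward(x: np.ndarray) -> np.ndarray:
    # Frozen path plus low-rank correction; only A and B would be trained.
    return x @ W.T + x @ (B @ A).T

print(adapted_forward(rng.normal(size=(1, d))).shape)  # (1, 256)
```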

Is Cosine-Similarity of Embeddings Really About Similarity?

1. Introduction: discrete entities are embedded into dense real-valued vectors; word embeddings for LLMs; recommender systems; the embedding vector… (sketch below)

March 15, 2024 · 0 comments
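
A minimal numpy illustration of the paper's caveat: cosine similarity is not invariant to the per-dimension rescalings that matrix-factorization objectives leave free, so it can come out fairly arbitrary. The vectors and scaling matrix here are made up.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=3)

# A diagonal rescaling D leaves a factorization (A D)(D^-1 B) unchanged,
# but it changes the cosine similarity between embedding rows:
D = np.diag([1.0, 10.0, 0.1])
print(cosine(a, b), cosine(D @ a, D @ b))  # generally different values
```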

Beyond Language Models: Byte Models are Digital World Simulators

Deep learning has focused on interpretable digital media files (text, images, audio); text has played a central role in conveying human intelligence and has l…

March 9, 2024 · 0 comments

The Era of 1-bit LLMs: All LLMs are in 1.58 bits

Abstract: BitNet paved the way for a new era of 1-bit LLMs; BitNet b1.58 has every parameter as a *ternary* value in {-1, 0, 1}; matches a full-precision Transformer… (toy sketch below)

March 4, 2024 · 0 comments
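
A toy version of the ternary quantization the teaser mentions, in the spirit of BitNet b1.58's absmean scheme: scale by the mean absolute weight, then round and clip to {-1, 0, 1}. The matrix size and inputs are made up.

```python
import numpy as np

def ternary_quantize(W: np.ndarray, eps: float = 1e-8):
    """Absmean-style quantization: W is approximated by gamma * Wq,
    where every entry of Wq is -1, 0, or +1."""
    gamma = np.abs(W).mean()                         # per-tensor scale
    Wq = np.clip(np.round(W / (gamma + eps)), -1, 1)
    return Wq, gamma

W = np.random.default_rng(0).normal(size=(4, 4))
Wq, gamma = ternary_quantize(W)
print(Wq)                              # entries are only -1, 0, or 1
print(np.abs(W - gamma * Wq).mean())   # reconstruction error of the toy
```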

CoT Reasoning without Prompting

1. Introduction: LLMs' reasoning capabilities are elicited by prompting techniques; few-shot prompting with intermediate-steps-augmented demonstrations… (sketch below)

February 23, 2024 · 0 comments
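
A schematic sketch of the decoding idea as I understand the paper: branch on the top-k first tokens instead of pure greedy decoding, then keep the branch whose answer has the largest top-1 vs top-2 probability margin. decode_branch and the demo values are placeholders.

```python
import numpy as np

def cot_decode(first_token_probs: np.ndarray, decode_branch, k: int = 5) -> str:
    """Explore the k most likely first tokens; each branch is decoded
    greedily and scored by its answer-token confidence margin."""
    branches = np.argsort(-first_token_probs)[:k]
    scored = []
    for tok_id in branches:
        # decode_branch is assumed to continue greedily from tok_id and
        # return (text, margin), margin = p(top1) - p(top2) on answer tokens.
        text, margin = decode_branch(int(tok_id))
        scored.append((margin, text))
    return max(scored)[1]                # highest-margin branch wins

# toy demo with fabricated branches and margins
demo = {0: ("path A ... answer 60", 0.9), 1: ("path B ... answer 5", 0.2)}
print(cot_decode(np.array([0.5, 0.4, 0.1]),
                 lambda t: demo.get(t, ("no answer", 0.0))))
```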

Self-Discover: LLMs Self-Compose Reasoning Structure

1. Introduction: to enhance LLMs' capability to reason and solve complex problems via prompting; few-shot & zero-shot CoT $\rightarrow$ how humans…

February 16, 2024 · 0 comments

Repeat After Me: Transformers are Better than State Space Models at Copying

1. Introduction: Transformers require $\Omega(L)$ memory and compute to predict the next token of a sequence of length $L$ (using Flash Attention!)…

February 7, 2024 · 0 comments

Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs

1. Introduction: LLMs are not guaranteed to be accurate for all queries; understanding which queries they are reliable for is important; selective prediction… (sketch below)

February 1, 2024 · 1 comment
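
A minimal abstention wrapper matching the teaser's setup: answer only when a self-evaluation score clears a threshold. answer_fn, self_eval_fn, and the 0.8 threshold are placeholders for an LLM call and its self-assessed confidence, not the paper's adaptation procedure.

```python
from typing import Callable, Optional

def selective_predict(query: str,
                      answer_fn: Callable[[str], str],
                      self_eval_fn: Callable[[str, str], float],
                      threshold: float = 0.8) -> Optional[str]:
    """Return an answer only when self-evaluated confidence is high;
    otherwise abstain (None) rather than risk a wrong prediction."""
    answer = answer_fn(query)
    confidence = self_eval_fn(query, answer)  # e.g. P("answer is correct")
    return answer if confidence >= threshold else None

print(selective_predict("2+2?", lambda q: "4", lambda q, a: 0.95))  # "4"
```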

Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text

Introducing a method for detecting LLM-generated text in a zero-shot setting (no training samples from the LLM source); outperforms all models with ChatGPT…

January 26, 2024 · 0 comments

Sparse Upcycling: Training MoE from Dense Checkpoints

1. Introduction: increased scale is one of the main drivers of better performance in DL (NLP, vision, speech, RL, multimodal, etc.); most SOTA neural networks…

January 16, 2024 · 0 comments

SOLAR 10.7B: Scaling LLMs with Simple yet Effective Depth Up-Scaling

1. Introduction: recent LLM scaling follows the performance scaling law $\rightarrow$ MoE; often requires non-trivial changes to the training and inference… (toy sketch below)

January 9, 2024 · 0 comments
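
A toy rendering of depth up-scaling (DUS): duplicate the layer stack, drop the overlapping middle layers from each copy, and continue pretraining the result. The overlap of 8 matches the paper's 32-to-48-layer recipe; everything else here is illustrative.

```python
def depth_up_scale(layers: list, overlap: int = 8) -> list:
    """Concatenate two copies of the stack, dropping `overlap` layers
    from the end of the first copy and the start of the second."""
    return layers[:-overlap] + layers[overlap:]

base = [f"layer_{i}" for i in range(32)]
print(len(depth_up_scale(base)))  # 48 layers, as in SOLAR 10.7B
```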