GPT
- Analytical AI to Generative AI
- Large PLM + prompt $\rightarrow$ superior performance
- Need to evaluate the quality of these generated texts
- Evaluating singl
Instruction Tuning Dataset
- Instruction tuning is important for LLMs
- Automatic generation methods are unsuitable for domains where accuracy is important
Previous Legal Expert Systems
- Useful in certain areas
Deep Learning Based Approaches
- Legal Judgement Prediction
- Legal Content Generation
- Legal Text Classification
1. Introduction
- PLMs learn a substantial amount of in-depth knowledge from data
- They can't expand or revise their memory
- They can't straightforward
Abstract
- 200K US congressional speeches + 5K presidential communications related to immigration, from 1880 to the present
- Political speech about immigration
Elements of World Knowledge (EWoK): a cognition-inspired framework for evaluating basic world knowledge in LMs
- LLMs acquire a substantial amount of knowledge
LLM Acceleration
- Sparsity
- Quantization
- Head pruning
- Reducing the number of layers for each token by exiting early during inference
- Speculative decoding: main
1. Introduction
- LLMs have become very powerful and are used in many fields
- Due to Llama 2 and 3, open-source LLMs have seen significant growth
- Use
1. Introduction
- In-context learning $\rightarrow$ an important emergent capability of LLMs
- Without updating the model parameters, an LLM can solve various
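A minimal sketch of what in-context learning looks like at the prompt level: demonstrations are simply concatenated in front of the query, and no parameter is updated (the prompt template below is illustrative):

```python
# Build a few-shot in-context-learning prompt: demonstrations first,
# then the unanswered query for the model to complete.

def build_icl_prompt(demos, query):
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

demos = [("great movie", "positive"), ("boring plot", "negative")]
print(build_icl_prompt(demos, "loved it"))
```

The "learning" happens entirely inside the forward pass conditioned on this string.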
- SOTA models' complexity $\rightarrow$ computation / memory / communication bandwidth
- LoRA
- Quantizing model parameters
- Prior work has been limited to fin
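The LoRA idea noted above can be sketched in pure Python (toy shapes, illustrative values): the frozen weight W is augmented with a low-rank update (alpha/r) * B @ A, so only B (d_out x r) and A (r x d_in) are trained instead of the full matrix.

```python
# Minimal LoRA sketch: effective weight = W + (alpha/r) * B @ A.

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha=1.0, r=1):
    scale = alpha / r
    BA = matmul(B, A)  # d_out x d_in low-rank update
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight
B = [[1.0], [0.0]]            # d_out x r (r = 1), trainable
A = [[0.0, 2.0]]              # r x d_in, trainable
print(lora_effective_weight(W, A, B))  # [[1.0, 2.0], [0.0, 1.0]]
```

With r much smaller than the matrix dimensions, the trainable parameter count drops from d_out * d_in to r * (d_out + d_in).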
1. Introduction
- Discrete entities are embedded into dense real-valued vectors
- Word embeddings for LLMs
- Recommender systems
- The embedding vector
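The embedding idea above reduces to a lookup table mapping discrete IDs to dense vectors; the vocabulary, dimension, and values below are illustrative toys.

```python
# Toy embedding table: each discrete entity (word, user, item) gets a
# row of dense real-valued parameters, retrieved by index lookup.
import random

random.seed(0)
vocab = {"user_1": 0, "item_42": 1}
dim = 4
table = [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in vocab]

def embed(token):
    return table[vocab[token]]

print(len(embed("item_42")))  # 4
```

In a real model these rows are trained end-to-end rather than left at their random initialization.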
- Deep learning has focused on interpretable digital media files - text, images, audio
- Text has played a central role in conveying human intelligence and has l
Abstract
- BitNet paved the way for a new era of 1-bit LLMs
- BitNet b1.58 has every parameter as a *ternary* value in {-1, 0, 1}
- Matches a full-precision Transformer
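A sketch of the absmean ternary quantizer used by BitNet b1.58: weights are scaled by their mean absolute value, then rounded and clipped to {-1, 0, +1} (toy weight values below are illustrative).

```python
# Absmean ternarization: W_q = clip(round(W / mean|W|), -1, 1).

def absmean_ternarize(w, eps=1e-8):
    gamma = sum(abs(x) for x in w) / len(w)  # mean absolute weight
    return [max(-1, min(1, round(x / (gamma + eps)))) for x in w]

print(absmean_ternarize([0.9, -0.05, 0.4, -1.2]))  # [1, 0, 1, -1]
```

Since every weight lands in {-1, 0, 1}, matrix multiplication reduces to additions and subtractions, with no floating-point multiplies in the weight path.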
1. Introduction
- LLMs' reasoning capabilities are elicited by prompting techniques
- Few-shot prompting with demonstrations augmented by intermediate steps
1. Introduction
- Goal: enhance LLMs' capability to reason and solve complex problems via prompting
- Few-shot & zero-shot CoT $\rightarrow$ how humans
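The contrast between a standard prompt and zero-shot CoT can be shown concretely: the only change is appending the reasoning trigger phrase "Let's think step by step" (the question text is illustrative).

```python
# Standard prompt vs. zero-shot chain-of-thought prompt.

QUESTION = "If there are 3 cars and each car has 4 wheels, how many wheels?"

standard = f"Q: {QUESTION}\nA:"
zero_shot_cot = f"Q: {QUESTION}\nA: Let's think step by step."

print(zero_shot_cot)
```

Few-shot CoT instead prepends worked examples whose answers include the intermediate reasoning steps.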
1. Introduction
- Transformers require $\Omega(L)$ memory and compute to predict the next token of a sequence of length $L$ (even using FlashAttention!)
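A worked example of that $\Omega(L)$ per-token state: the KV cache grows linearly with sequence length. The shapes below are illustrative (roughly Llama-2-7B-like), not taken from the paper.

```python
# KV cache size: 2 (keys + values) x layers x heads x head_dim x L x bytes.

def kv_cache_bytes(L, n_layers=32, n_heads=32, head_dim=128, bytes_per=2):
    # bytes_per = 2 assumes fp16 storage
    return 2 * n_layers * n_heads * head_dim * L * bytes_per

print(kv_cache_bytes(4096) / 2**30)  # 2.0 GiB at L = 4096
```

Doubling the context length doubles this memory, which is why long-context inference is bandwidth- and memory-bound.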
1. Introduction
- An LLM is not guaranteed to be accurate for all queries
- Understanding which queries it is reliable for is important
- Selective Prediction
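The selective-prediction setup can be sketched in one function: answer only when a confidence score clears a threshold, otherwise abstain. The confidence values below are illustrative, not from the paper.

```python
# Selective prediction: trade coverage for accuracy by abstaining on
# low-confidence queries.

def selective_predict(answer, confidence, threshold=0.8):
    return answer if confidence >= threshold else "ABSTAIN"

print(selective_predict("Paris", 0.95))  # confident -> "Paris"
print(selective_predict("Lyon", 0.40))   # uncertain -> "ABSTAIN"
```

The interesting research question is where the confidence score comes from (logits, self-consistency, a trained verifier), not the thresholding itself.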
- Introducing a method for detecting LLM-generated text in a zero-shot setting (no training samples from the source LLM)
- Outperforms all models with ChatGPT
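One common zero-shot detection recipe (DetectGPT-style probability curvature) can be sketched abstractly: if a passage's score drops under small perturbations, it likely sat near a local maximum of the scoring model and is flagged as machine-generated. The scorer and perturbation below are toy stand-ins, not the note's actual method.

```python
# Curvature-based zero-shot detector: compare the original score to the
# mean score of perturbed variants; positive curvature -> flag.

def detect(score_fn, text, perturb_fn, n=8, margin=0.0):
    orig = score_fn(text)
    perturbed = [score_fn(perturb_fn(text, i)) for i in range(n)]
    curvature = orig - sum(perturbed) / n
    return curvature > margin  # True -> likely machine-generated

# Illustrative stand-ins for a real log-prob scorer and word-swap perturber:
score = lambda t: -len(set(t))                       # toy "log-prob"
perturb = lambda t, i: t.replace(t[0], "xyz"[i % 3], 1)

print(detect(score, "aaaa", perturb))
```

No samples from the generating model are needed; only a scoring model and a perturbation function.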
1. Introduction
- Increased scale is one of the main drivers of better performance in DL (NLP, vision, speech, RL, multimodal, etc.)
- Most SOTA neural networks
1. Introduction
- Recent LLMs scale up, with performance following scaling laws $\rightarrow$ MoE
- Often require non-trivial changes to the training and inference
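The MoE routing step can be sketched in a few lines: a gate scores each expert, the top-k experts run, and their outputs are mixed with softmax weights over the selected scores. The experts and gate scores below are illustrative toys.

```python
# Top-k mixture-of-experts forward pass over scalar inputs.
import math

def moe_forward(x, experts, gate_scores, k=2):
    # pick the k highest-scoring experts
    topk = sorted(range(len(experts)), key=lambda i: gate_scores[i])[-k:]
    # softmax over the selected gate scores
    exp_scores = [math.exp(gate_scores[i]) for i in topk]
    z = sum(exp_scores)
    weights = [e / z for e in exp_scores]
    # weighted mix of the chosen experts' outputs
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x]
print(moe_forward(3.0, experts, gate_scores=[0.1, 2.0, 2.0], k=2))  # 7.5
```

Only k of the experts execute per token, which is what decouples parameter count from per-token compute and complicates training/inference infrastructure (load balancing, expert parallelism).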