Paper Reviews

1. [Paper Review] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

2. [Paper Review] Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms

3. [Paper Review] Attention Is All You Need

4. [Paper Review] Denoising Diffusion Probabilistic Models

5. [Paper Review] Link Prediction Based on Graph Neural Networks

6. [Paper Review] GNNExplainer: Generating Explanations for Graph Neural Networks

7. [Paper Review] Hierarchical Graph Representation Learning with Differentiable Pooling

8. [Paper Review] SuperGlue: Learning Feature Matching with Graph Neural Networks

9. [Paper Review] Revisiting Deep Learning Models for Tabular Data

10. [Paper Review] T2G-FORMER: Organizing Tabular Features into Relation Graphs Promotes Heterogeneous Feature Interaction

11. [Paper Review] Why do tree-based models still outperform deep learning on typical tabular data?

12. [Paper Review] TabTransformer: Tabular Data Modeling Using Contextual Embeddings

13. [Paper Review] Large Scale Transfer Learning for Tabular Data via Language Modeling

14. [Paper Review] Large Language Models on Tabular Data - A Survey

15. [Paper Review] Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains

16. [Paper Review] Representation Space Augmentation for Effective Self-Supervised Learning on Tabular Data

17. [Paper Review] AGATa: Attention-Guided Augmentation for Tabular Data in Contrastive Learning

18. [Paper Review] TabGLM: Tabular Graph Language Model for Learning Transferable Representations Through Multi-Modal Consistency Minimization

19. [Paper Review] MultiTab: A Scalable Foundation for Multitask Learning on Tabular Data

20. [Paper Review] Audio Description Generation in the Era of LLMs and VLMs: A Review of Transferable Generative AI Technologies

21. [Paper Review] BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

22. [Paper Review] DANTE-AD: Dual-Vision Attention Network for Long-Term Audio Description

23. [Paper Review] MMAD: Multi-modal Movie Audio Description

24. [Paper Review] Qianfan-VL: Domain-Enhanced Universal Vision-Language Models

25. [Paper Review] ERNIE4.5 Technical Report

26. [Paper Review] An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale

27. [Paper Review] Learning Transferable Visual Models From Natural Language Supervision

28. [Paper Review] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

29. [Paper Review] Visual Instruction Tuning

30. [Paper Review] V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs

31. [Paper Review] Efficient Semantic Uncertainty Quantification in Language Models via Diversity-Steered Sampling

32. [Paper Review] ARC Is a Vision Problem!

33. [Paper Review] Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail

34. [Paper Review] Adaptive Action Chunking at Inference-time for Vision-Language-Action Models

35. [Paper Review] A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model

36. [Paper Review] ViVa: A Video-Generative Value Model for Robot Reinforcement Learning

37. [Paper Review] RT-1: Robotics Transformer for Real-World Control at Scale

38. [Paper Review] RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

39. [Paper Review] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

40. [Paper Review] Octo: An Open-Source Generalist Robot Policy

41. [Paper Review] OpenVLA: An Open-Source Vision-Language-Action Model

42. [Paper Review] Green-VLA: Staged Vision-Language-Action Model for Generalist Robots

43. [Paper Review] VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

44. [Paper Review] OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

45. [Paper Review] LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

46. [Paper Review] DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off

47. [Paper Review] Near-Future Policy Optimization

48. [Paper Review] PersonaVLM: Long-Term Personalized Multimodal LLMs

49. [Paper Review] Long-Horizon Manipulation via Trace-Conditioned VLA Planning

50. [Paper Review] CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models

51. [Paper Review] VLA-RFT: Vision-Language-Action Reinforcement Fine-Tuning with Verified Rewards in World Simulators

52. [Paper Review] Spatial Forcing: Implicit Spatial Representation Alignment for Vision-Language-Action Model

53. [Paper Review] Nature-Inspired Population-Based Evolution of Large Language Models
