(23S) [EE488(D)] Introduction to Reinforcement Learning - Project

구명규 · May 30, 2023



1. Problem Statement

Environment Definition

[Outline]

Rocket Recycling
1) Hovering: keep the rocket upright and stationary at the target point
2) Landing: land the rocket safely at the designated landing point

[Environment Settings]

Earth gravity

Air resistance proportional to the velocity
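The two settings above can be sketched as a single velocity update. This is only a minimal illustration, not the project's simulator: the drag coefficient `k`, the time step `dt`, the value of `G`, and the explicit-Euler integration are all assumptions, and thrust is left out.

```python
G = 9.8  # Earth gravity in m/s^2 (assumed value)


def step_velocity(vx, vy, k=0.1, dt=0.05):
    """One explicit-Euler velocity update under gravity and linear drag.

    k (1/s) and dt (s) are hypothetical values chosen for illustration.
    """
    ax = -k * vx        # drag opposes horizontal motion
    ay = -G - k * vy    # gravity plus drag on vertical motion
    return vx + ax * dt, vy + ay * dt
```

With drag proportional to velocity, a falling rocket without thrust approaches a terminal vertical velocity of `-G / k`, where the drag term cancels gravity.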

State Space

$(x, y, v_x, v_y, \theta, v_\theta, t, \phi) \in \mathbb{R}^8$

Rocket position (initially 500 m above the ground) and speed (initially -50 m/s)

Rocket angle (initially 90 deg) and angular velocity

Time

Nozzle angle, limited to [-20 deg, +20 deg]
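As a rough sketch, the eight state variables could be held in a small container like the one below. The class name `RocketState`, the helper `clip_nozzle`, and the use of radians are hypothetical. The initial values for `x`, `vx`, `vtheta`, and `phi` are not given above and are assumed to be zero, and the -50 m/s initial speed is assumed to be vertical.

```python
import math
from dataclasses import dataclass, astuple

PHI_MAX = math.radians(20.0)  # nozzle angle limit of 20 deg


@dataclass
class RocketState:
    x: float = 0.0                     # horizontal position in m (assumed start)
    y: float = 500.0                   # 500 m above the ground
    vx: float = 0.0                    # horizontal velocity in m/s (assumed start)
    vy: float = -50.0                  # assumes the -50 m/s speed is vertical
    theta: float = math.radians(90.0)  # rocket angle, initially 90 deg
    vtheta: float = 0.0                # angular velocity in rad/s (assumed start)
    t: float = 0.0                     # elapsed time
    phi: float = 0.0                   # nozzle angle in rad (assumed start)


def clip_nozzle(phi):
    """Keep the nozzle angle inside [-20 deg, +20 deg]."""
    return max(-PHI_MAX, min(PHI_MAX, phi))
```

Calling `astuple(RocketState())` yields the 8-dimensional state vector in the same order as the definition above.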

Action Space

{ $f, w_\phi$ }

Thrust acceleration: { 0.2g, 1.0g, 2.0g } (discrete)

Angular velocity of nozzle: {0, -30deg/s, +30deg/s} (discrete)
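One way to handle the two discrete controls is to enumerate every pair and let the agent pick an index. The sketch below assumes the joint action space is the Cartesian product of the two sets, which gives 3 x 3 = 9 actions; the names `ACTIONS` and `decode_action` and the value of `G` are hypothetical.

```python
import itertools
import math

G = 9.8  # m/s^2 (assumed value)

THRUSTS = [0.2 * G, 1.0 * G, 2.0 * G]                          # thrust acceleration
NOZZLE_RATES = [0.0, math.radians(-30.0), math.radians(30.0)]  # rad/s

# Every (thrust, nozzle rate) pair, indexed 0..8
ACTIONS = list(itertools.product(THRUSTS, NOZZLE_RATES))


def decode_action(index):
    """Map a discrete action index to (thrust acceleration, nozzle angular velocity)."""
    return ACTIONS[index]
```

A policy network with nine outputs can then select an index, and `decode_action` turns it back into the physical controls.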

Reward

HOVERING
Higher reward the closer the rocket position is to the predefined target point
and the closer the rocket angle is to 0.
LANDING
At touchdown, higher reward the closer the rocket position is to the predefined landing point,
when the rocket speed is below the safe threshold,
and the closer the rocket angle is to 0.
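The reward description above gives no formulas, so the following is only a hypothetical shaping that matches the stated trends. The linear `closeness` term, the equal 0.5 weights, the target point, the distance scale, and the safe-speed threshold are all assumptions, and treating the speed condition as a hard gate is one possible reading. Here `theta` is the rocket angle as the reward measures it, in radians, with 0 being best.

```python
import math


def closeness(error, scale):
    """Map a non-negative error to [0, 1]; 1 means zero error."""
    return max(0.0, 1.0 - error / scale)


def hovering_reward(x, y, theta, target=(0.0, 200.0), dist_scale=100.0):
    """Per-step reward: near the target point and angle near 0 (assumed form)."""
    dist = math.hypot(x - target[0], y - target[1])
    return 0.5 * (closeness(dist, dist_scale)
                  + closeness(abs(theta), math.pi / 2))


def landing_reward(x, speed, theta, landing_x=0.0, safe_speed=5.0,
                   dist_scale=100.0):
    """Reward at touchdown; zero if the speed is not below the safe threshold."""
    if speed >= safe_speed:
        return 0.0
    return 0.5 * (closeness(abs(x - landing_x), dist_scale)
                  + closeness(abs(theta), math.pi / 2))
```

Under these assumptions both rewards lie in [0, 1], reaching 1 only when the rocket is exactly on the target (or landing point) with angle 0.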