Safe Reinforcement Learning for 3D Large-Scale Constrained Layout Optimization
December 2025
Jiyong Kim
Seokjun Kim
Sanghoon Jin
Yubin Lee
Namwoo Kang*
Abstract
In this work, we propose a safe reinforcement learning framework for large-scale layout optimization under complex constraints. The problem is formulated as a constrained Markov decision process (CMDP), where the placement of objects is represented as hybrid actions consisting of continuous coordinates and discrete floor/rotation indices. To enforce constraints during training, we employ a constrained actor–critic architecture, in which the policy network is trained jointly with reward and cost critics. After reinforcement learning converges, the near-optimal layouts obtained from the safe reinforcement learning agent are further refined through fine-grained optimization using metaheuristic search, which improves objective values and constraint satisfaction. We expect that our two-stage approach effectively balances exploration and constraint enforcement, achieving competitive performance while reducing constraint violations in layout optimization tasks.
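The abstract does not give implementation details, so the following is a minimal PyTorch sketch of the kind of constrained actor–critic with hybrid actions it describes. All module names, network sizes, and the Lagrangian-style actor loss are illustrative assumptions for exposition, not the authors' code.

```python
# Minimal sketch of a constrained actor-critic with hybrid actions
# (continuous coordinates + discrete floor/rotation indices).
# Sizes, names, and the Lagrangian relaxation are assumptions.
import torch
import torch.nn as nn

class HybridPolicy(nn.Module):
    """Policy emitting hybrid actions: continuous (x, y) placement
    coordinates plus discrete floor and rotation indices."""
    def __init__(self, obs_dim, n_floors=4, n_rotations=4, hidden=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.coord_mu = nn.Linear(hidden, 2)          # continuous (x, y)
        self.coord_log_std = nn.Parameter(torch.zeros(2))
        self.floor_logits = nn.Linear(hidden, n_floors)   # discrete floor index
        self.rot_logits = nn.Linear(hidden, n_rotations)  # discrete rotation index

    def forward(self, obs):
        h = self.backbone(obs)
        coord_dist = torch.distributions.Normal(
            self.coord_mu(h), self.coord_log_std.exp())
        floor_dist = torch.distributions.Categorical(logits=self.floor_logits(h))
        rot_dist = torch.distributions.Categorical(logits=self.rot_logits(h))
        return coord_dist, floor_dist, rot_dist

class Critic(nn.Module):
    """Scalar value head; one instance estimates return, another cost."""
    def __init__(self, obs_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, obs):
        return self.net(obs).squeeze(-1)

obs_dim = 32
policy = HybridPolicy(obs_dim)
reward_critic, cost_critic = Critic(obs_dim), Critic(obs_dim)
lam = torch.tensor(1.0, requires_grad=True)  # Lagrange multiplier for the CMDP

# Sample a hybrid action and compute its joint log-probability.
obs = torch.randn(8, obs_dim)
coord_dist, floor_dist, rot_dist = policy(obs)
coords, floor, rot = coord_dist.sample(), floor_dist.sample(), rot_dist.sample()
log_prob = (coord_dist.log_prob(coords).sum(-1)
            + floor_dist.log_prob(floor) + rot_dist.log_prob(rot))

# Lagrangian-relaxed actor loss: maximize the reward advantage while
# penalizing the cost advantage, weighted by the multiplier.
reward_adv = torch.randn(8)  # placeholder for an advantage estimate
cost_adv = torch.randn(8)    # placeholder for a cost-advantage estimate
actor_loss = -(log_prob * (reward_adv - lam.detach() * cost_adv)).mean()
```

In this kind of setup, the multiplier is typically updated by gradient ascent on the expected constraint violation, so that costs above the budget tighten the penalty during training; the second-stage metaheuristic refinement then operates on the converged policy's layouts rather than on the networks.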
Type
Publication
Korean Society of Mechanical Engineers (KSME 2025)