oswinso.xyz
185.199.108.153
Public Scan
Submitted URL: http://oswinso.xyz/
Effective URL: https://oswinso.xyz/
Submission: On January 16 via api from US — Scanned from DE
Form analysis
0 forms found in the DOM
Text Content
OSWIN SO
1ST YEAR GRAD STUDENT @ REALM, MIT AEROASTRO

I’m Oswin So, a 1st year grad student in REALM at MIT, advised by Chuchu Fan. Previously, I did my undergrad at Georgia Tech, where I was very fortunate to do undergraduate research with Evangelos Theodorou and Molei Tao.

Last summer, I interned at Toyota Research Institute, where I worked on game theoretic planning. Before that, I worked at Aurora as a Behavior Planning Intern during the summer of 2021 under Paul Vernaza and Arun Venkatraman, working on cost function learning via on-policy negative examples for autonomous driving.

See my full CV here (updated November 2021).

Contact: oswinso [at] mit [dot] edu
Follow: Google Scholar | LinkedIn | oswinso | @oswinso

NEWS

Jul 2023: My first paper after joining MIT as a grad student, on combining deep RL and optimal control to synthesize safe, stabilizing controllers, has been accepted to RSS 2023! Check out the project page for cool visualizations and videos.

Jan 2022: Two of my recent works, Multimodal Maximum Entropy Dynamic Games and Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM, were recently submitted to RSS 2022 and are currently under review! Check them out. Also, Maximum Entropy Differential Dynamic Programming has just been accepted to ICRA 2022!

Sep 2021: Check out our most recent work, Maximum Entropy Differential Dynamic Programming, currently under review for ICRA 2022!

May 2021: Variational Inference MPC using Tsallis Divergence has been accepted to RSS 2021!

Mar 2021: Adaptive Risk Sensitive Model Predictive Control with Stochastic Search has been accepted to L4DC 2021!

PUBLICATIONS

2023

1.
RSS 2023
Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep Reinforcement Learning
Oswin So and Chuchu Fan
Robotics: Science and Systems, 2023
Website | arXiv

Abstract: Tasks for autonomous robotic systems commonly require stabilization to a desired region while maintaining safety specifications. However, solving this multi-objective problem is challenging when the dynamics are nonlinear and high-dimensional, as traditional methods do not scale well and are often limited to specific problem structures. To address this issue, we propose a novel approach to solve the stabilize-avoid problem via the solution of an infinite-horizon constrained optimal control problem (OCP). We transform the constrained OCP into epigraph form and obtain a two-stage optimization problem that optimizes over the policy in the inner problem and over an auxiliary variable in the outer problem. We then propose a new method for this formulation that combines an on-policy deep reinforcement learning algorithm with neural network regression. Our method yields better stability during training, avoids instabilities caused by saddle-point finding, and is not restricted to specific requirements on the problem structure compared to more traditional methods. We validate our approach on different benchmark tasks, ranging from low-dimensional toy examples to an F16 fighter jet with a 17-dimensional state space. Simulation results show that our approach consistently yields controllers that match or exceed the safety of existing methods while providing ten-fold increases in stability performance from larger regions of attraction.

@inproceedings{so2023solving,
  title={Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep Reinforcement Learning},
  author={So, Oswin and Fan, Chuchu},
  booktitle={Robotics: Science and Systems},
  year={2023},
}

2.
ICRA 2023
MPOGames: Efficient Multimodal Partially Observable Dynamic Games
Oswin So, Paul Drews, Thomas Balch, Velin Dimitrov, Guy Rosman, and Evangelos A. Theodorou
2023 IEEE International Conference on Robotics and Automation (ICRA)
arXiv

Abstract: Game theoretic methods have become popular for planning and prediction in situations involving rich multi-agent interactions. However, these methods often assume the existence of a single local Nash equilibrium and are hence unable to handle uncertainty in the intentions of different agents. While maximum entropy (MaxEnt) dynamic games try to address this issue, practical approaches solve for MaxEnt Nash equilibria using linear-quadratic approximations, which are restricted to unimodal responses and unsuitable for scenarios with multiple local Nash equilibria. By reformulating the problem as a POMDP, we propose MPOGames, a method for efficiently solving MaxEnt dynamic games that captures the interactions between local Nash equilibria. We show the importance of uncertainty-aware game theoretic methods via a two-agent merge case study. Finally, we demonstrate the real-time capabilities of our approach with hardware experiments on a 1/10th scale car platform.

@inproceedings{so2022mpogames,
  title={MPOGames: Efficient Multimodal Partially Observable Dynamic Games},
  author={So, Oswin and Drews, Paul and Balch, Thomas and Dimitrov, Velin and Rosman, Guy and Theodorou, Evangelos A.},
  booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2023},
}

2022

1.
ML4PS 2022
Data-driven discovery of non-Newtonian astronomy via learning non-Euclidean Hamiltonian
Oswin So, Gongjie Li, Evangelos A Theodorou, and Molei Tao
Machine Learning and the Physical Sciences Workshop, NeurIPS, 2022
arXiv

Abstract: Incorporating the Hamiltonian structure of physical dynamics into deep learning models provides a powerful way to improve interpretability and prediction accuracy.
While previous works are mostly limited to Euclidean spaces, their extension to the Lie group manifold is needed when rotations form a key component of the dynamics, such as the higher-order physics beyond simple point-mass dynamics for N-body celestial interactions. Moreover, the multiscale nature of these processes presents a challenge to existing methods, as a long time horizon is required. By leveraging a symplectic Lie-group manifold preserving integrator, we present a method for data-driven discovery of non-Newtonian astronomy. Preliminary results show the importance of both these properties in training stability and prediction accuracy.

@inproceedings{so2022data,
  title={Data-driven discovery of non-Newtonian astronomy via learning non-Euclidean Hamiltonian},
  author={So, Oswin and Li, Gongjie and Theodorou, Evangelos A and Tao, Molei},
  booktitle={Machine Learning and the Physical Sciences Workshop NeurIPS},
  year={2022},
}

2.
NeurIPS 2022
Deep Generalized Schrodinger Bridge
Guan-Horng Liu, Tianrong Chen*, Oswin So*, and Evangelos A Theodorou
Thirty-Sixth Conference on Neural Information Processing Systems, 2022
arXiv

Abstract: The Mean-Field Game (MFG) serves as a crucial mathematical framework for modeling the collective behavior of individual agents interacting stochastically with a large population. In this work, we aim at solving a challenging class of MFGs in which the differentiability of these interacting preferences may not be available to the solver, and the population is urged to converge exactly to some desired distribution. These setups are, despite being well-motivated for practical purposes, complicated enough to paralyze most (deep) numerical solvers. Nevertheless, we show that Schrödinger Bridge, as an entropy-regularized optimal transport model, can be generalized to accept mean-field structures, hence solving these MFGs.
This is achieved via the application of Forward-Backward Stochastic Differential Equations theory, which, intriguingly, leads to a computational framework with a similar structure to Temporal Difference learning. As such, it opens up novel algorithmic connections to Deep Reinforcement Learning that we leverage to facilitate practical training. We show that our proposed objective function provides necessary and sufficient conditions for the mean-field problem. Our method, named Deep Generalized Schrödinger Bridge (DeepGSB), not only outperforms prior methods in solving classical population navigation MFGs, but is also capable of solving 1000-dimensional opinion depolarization, setting a new state-of-the-art numerical solver for high-dimensional MFGs. Our code will be made available at https://github.com/ghliu/DeepGSB.

@inproceedings{liu2022deep,
  title={Deep Generalized Schrodinger Bridge},
  author={Liu, Guan-Horng and Chen*, Tianrong and So*, Oswin and Theodorou, Evangelos A},
  booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
  year={2022},
}

3.
Multimodal Maximum Entropy Dynamic Games
Oswin So, Kyle Stachowicz, and Evangelos A. Theodorou
arXiv preprint (in submission)
arXiv | Video

Abstract: Environments with multi-agent interactions often result in a rich set of modalities of behavior between agents due to the inherent suboptimality of decision making processes when agents settle for satisfactory decisions. However, existing algorithms for solving these dynamic games are strictly unimodal and fail to capture the intricate multimodal behaviors of the agents. In this paper, we propose MMELQGames (Multimodal Maximum-Entropy Linear Quadratic Games), a novel constrained multimodal maximum entropy formulation of the Differential Dynamic Programming algorithm for solving generalized Nash equilibria.
By formulating the problem as a certain dynamic game with incomplete and asymmetric information where agents are uncertain about the cost and dynamics of the game itself, the proposed method is able to reason about multiple local generalized Nash equilibria, enforce constraints with the Augmented Lagrangian framework, and also perform Bayesian inference on the latent mode from past observations. We assess the efficacy of the proposed algorithm on two illustrative examples: multi-agent collision avoidance and autonomous racing. In particular, we show that only MMELQGames is able to effectively block a rear vehicle when given a speed disadvantage and the rear vehicle can overtake from multiple positions.

@article{so2022multimodal,
  title={Multimodal Maximum Entropy Dynamic Games},
  author={So, Oswin and Stachowicz, Kyle and Theodorou, Evangelos A.},
  journal={arXiv preprint (in submission)},
  year={2022},
}

4.
RSS 2022
Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM
Marcus A. Pereira, Augustinos D. Saravanos, Oswin So, and Evangelos A. Theodorou
Robotics: Science and Systems, 2022
arXiv | Video

Abstract: In this work, we propose a novel safe and scalable decentralized solution for multi-agent control in the presence of stochastic disturbances. Safety is mathematically encoded using stochastic control barrier functions, and safe controls are computed by solving quadratic programs. Decentralization is achieved by augmenting each agent’s optimization variables with copy variables for its neighboring agents. This allows us to decouple the centralized multi-agent optimization problem. However, to ensure safety, neighboring agents must agree on what is safe for both of them, and this creates a need for consensus. To enable safe consensus solutions, we incorporate an ADMM-based approach.
Specifically, we propose a Merged CADMM-OSQP implicit neural network layer that solves a mini-batch of both local quadratic programs and the overall consensus problem as a single optimization problem. This layer is embedded within a Deep FBSDEs network architecture at every time step to facilitate end-to-end differentiable, safe, and decentralized stochastic optimal control. The efficacy of the proposed approach is demonstrated on several challenging multi-robot tasks in simulation. By imposing requirements on safety specified by collision avoidance constraints, the safe operation of all agents is ensured during the entire training process. We also demonstrate superior scalability in terms of computational and memory savings as compared to a centralized approach.

@inproceedings{pereira2022decentralized,
  title={Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and ADMM},
  author={Pereira, Marcus A. and Saravanos, Augustinos D. and So, Oswin and Theodorou, Evangelos A.},
  booktitle={Robotics: Science and Systems},
  year={2022},
}

2021

1.
ICRA 2022
Maximum Entropy Differential Dynamic Programming
Oswin So, Ziyi Wang, and Evangelos A. Theodorou
2022 International Conference on Robotics and Automation (ICRA)
arXiv | Video

Abstract: In this paper, we present a novel maximum entropy formulation of the Differential Dynamic Programming algorithm and derive two variants using unimodal and multimodal value function parameterizations. By combining the maximum entropy Bellman equations with a particular approximation of the cost function, we are able to obtain a new formulation of Differential Dynamic Programming which is able to escape from local minima via exploration with a multimodal policy. To demonstrate the efficacy of the proposed algorithm, we provide experimental results using four systems on tasks that are represented by cost functions with multiple local minima and compare them against vanilla Differential Dynamic Programming.
Furthermore, we discuss connections with previous work on the linearly solvable stochastic control framework and its extensions in relation to compositionality.

@inproceedings{so2021maximum,
  title={Maximum Entropy Differential Dynamic Programming},
  author={So, Oswin and Wang, Ziyi and Theodorou, Evangelos A.},
  booktitle={2022 International Conference on Robotics and Automation (ICRA)},
  year={2021},
}

2.
RSS 2021
Variational Inference MPC using Tsallis Divergence
Ziyi Wang*, Oswin So*, Jason Gibson, Bogdan Vlahov, Manan S Gandhi, Guan-Horng Liu, and Evangelos A Theodorou
Robotics: Science and Systems, 2021
arXiv

Abstract: In this paper, we provide a generalized framework for Variational Inference-Stochastic Optimal Control by using the non-extensive Tsallis divergence. By incorporating the deformed exponential function into the optimality likelihood function, a novel Tsallis Variational Inference-Model Predictive Control algorithm is derived, which includes prior works such as Variational Inference-Model Predictive Control, Model Predictive Path Integral Control, Cross Entropy Method, and Stein Variational Inference Model Predictive Control as special cases. The proposed algorithm allows for effective control of the cost/reward transform and is characterized by superior performance in terms of mean and variance reduction of the associated cost. The aforementioned features are supported by a theoretical and numerical analysis on the level of risk sensitivity of the proposed algorithm as well as simulation experiments on 5 different robotic systems with 3 different policy parameterizations.

@inproceedings{wang2021variational,
  title={Variational Inference MPC using Tsallis Divergence},
  author={Wang*, Ziyi and So*, Oswin and Gibson, Jason and Vlahov, Bogdan and Gandhi, Manan S and Liu, Guan-Horng and Theodorou, Evangelos A},
  booktitle={Robotics: Science and Systems},
  year={2021},
}

3.
Spatio-Temporal Differential Dynamic Programming for Control of Fields
Ethan N Evans, Oswin So, Andrew P Kendall, Guan-Horng Liu, and Evangelos A Theodorou
arXiv preprint (in submission)
arXiv

Abstract: We consider the optimal control problem of a general nonlinear spatio-temporal system described by Partial Differential Equations (PDEs). Theory and algorithms for control of spatio-temporal systems are of rising interest in the automatic control community and exhibit numerous challenging characteristics from a control standpoint. Recent methods focus on finite-dimensional optimization techniques applied to a discretized finite-dimensional ODE approximation of the infinite-dimensional PDE system. In this paper, we derive a differential dynamic programming (DDP) framework for distributed and boundary control of spatio-temporal systems in infinite dimensions that is shown to generalize both the spatio-temporal LQR solution and modern finite-dimensional DDP frameworks. We analyze the convergence behavior and provide a proof of global convergence for the resulting system of continuous-time forward-backward equations. We explore and develop numerical approaches to handle sensitivities that arise during implementation, and apply the resulting STDDP algorithm to a linear and a nonlinear spatio-temporal PDE system. Our framework is derived in infinite-dimensional Hilbert spaces and represents a discretization-agnostic framework for control of nonlinear spatio-temporal PDE systems.

@article{evans2021spatio,
  title={Spatio-Temporal Differential Dynamic Programming for Control of Fields},
  author={Evans, Ethan N and So, Oswin and Kendall, Andrew P and Liu, Guan-Horng and Theodorou, Evangelos A},
  journal={arXiv preprint (in submission)},
  year={2021},
}

4.
L4DC 2021
Adaptive Risk Sensitive Model Predictive Control with Stochastic Search
Ziyi Wang, Oswin So, Keuntaek Lee, and Evangelos A. Theodorou
Learning for Dynamics & Control Conference, 2021
arXiv

Abstract: We present a general framework for optimizing the Conditional Value-at-Risk for dynamical systems using stochastic search. The framework is capable of handling the uncertainty from the initial condition, stochastic dynamics, and uncertain parameters in the model. The algorithm is compared against a risk-sensitive distributional reinforcement learning framework and outperforms it on a pendulum and cartpole with stochastic dynamics. We also showcase the applicability of the framework to robotics as an adaptive risk-sensitive controller by optimizing with respect to the fully nonlinear belief provided by a particle filter on a pendulum, cartpole, and quadcopter in simulation.

@inproceedings{wang2021adaptive,
  title={Adaptive Risk Sensitive Model Predictive Control with Stochastic Search},
  author={Wang, Ziyi and So, Oswin and Lee, Keuntaek and Theodorou, Evangelos A.},
  booktitle={Learning for Dynamics & Control Conference},
  year={2021},
}

© Copyright 2023 Oswin So. Last updated: September 30, 2023.