

OSWIN SO



1ST YEAR GRAD STUDENT @ REALM, MIT AEROASTRO

I’m Oswin So, a 1st year grad student in REALM at MIT, advised by Chuchu Fan.
Previously, I did my undergrad at Georgia Tech, where I was very fortunate to
work as an undergraduate researcher with Evangelos Theodorou and Molei Tao.

Last summer, I interned at Toyota Research Institute, where I worked on
game-theoretic planning. Before that, I was a Behavior Planning Intern at
Aurora during the summer of 2021 under Paul Vernaza and Arun Venkatraman,
working on cost-function learning via on-policy negative examples for
autonomous driving.

See my full CV here (updated November 2021).

Contact: oswinso [at] mit [dot] edu
Follow: Google Scholar | LinkedIn | oswinso | @oswinso




NEWS

Jul 2023

My first paper since joining MIT as a grad student, on combining Deep RL and
optimal control to synthesize safe, stabilizing controllers, has been accepted
to RSS 2023! Check out the project page for cool visualizations and videos.

Jan 2022

Two of my recent works, Multimodal Maximum Entropy Dynamic Games and
Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs and
ADMM, were recently submitted to RSS 2022 and are currently under review! Check
them out.

Also, Maximum Entropy Differential Dynamic Programming has just been accepted to
ICRA 2022!

Sep 2021

Check out our most recent work, Maximum Entropy Differential Dynamic
Programming, currently under review for ICRA 2022!

May 2021

Variational Inference MPC using Tsallis Divergence has been accepted to RSS
2021!

Mar 2021

Adaptive Risk Sensitive Model Predictive Control with Stochastic Search has been
accepted to L4DC 2021!


PUBLICATIONS


2023

 1. RSS 2023
    Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep
    Reinforcement Learning
    Oswin So, and Chuchu Fan
    Robotics: Science and Systems, 2023
    Website arXiv Cite Abstract
    
    Tasks for autonomous robotic systems commonly require stabilization to a
    desired region while maintaining safety specifications. However, solving
    this multi-objective problem is challenging when the dynamics are nonlinear
    and high-dimensional, as traditional methods do not scale well and are often
    limited to specific problem structures. To address this issue, we propose a
    novel approach to solve the stabilize-avoid problem via the solution of an
    infinite-horizon constrained optimal control problem (OCP). We transform the
    constrained OCP into epigraph form and obtain a two-stage optimization
    problem that optimizes over the policy in the inner problem and over an
    auxiliary variable in the outer problem. We then propose a new method for
    this formulation that combines an on-policy deep reinforcement learning
    algorithm with neural network regression. Our method yields better stability
    during training, avoids instabilities caused by saddle-point finding, and is
    not restricted to specific requirements on the problem structure compared to
    more traditional methods. We validate our approach on different benchmark
    tasks, ranging from low-dimensional toy examples to an F16 fighter jet with
    a 17-dimensional state space. Simulation results show that our approach
    consistently yields controllers that match or exceed the safety of existing
    methods while providing ten-fold increases in stability performance from
    larger regions of attraction.
    
    @inproceedings{so2023solving,
      title={Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep
    Reinforcement Learning},
      author={So, Oswin and Fan, Chuchu},
      booktitle={Robotics: Science and Systems},
      year={2023},
    }
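
    A generic sketch of the epigraph reformulation referenced in the abstract
    above (standard background, not necessarily the paper's exact formulation):
    a constrained problem is rewritten with an auxiliary variable z that upper
    bounds the objective, which separates it into an outer problem over z and an
    inner problem over the decision variable (here, the policy).

      \min_x f(x) \ \text{s.t.}\ g(x) \le 0
      \;\;\Longleftrightarrow\;\;
      \min_{x,\,z} z \ \text{s.t.}\ f(x) \le z,\; g(x) \le 0
      \;\;=\;\;
      \min_z \Big\{\, z \ \text{s.t.}\ \big[\inf_{x:\,g(x)\le 0} f(x)\big] \le z \,\Big\}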

 2. ICRA 2023
    MPOGames: Efficient Multimodal Partially Observable Dynamic Games
    Oswin So, Paul Drews, Thomas Balch, Velin Dimitrov, Guy Rosman, and
    Evangelos A. Theodorou
    2023 IEEE International Conference on Robotics and Automation (ICRA)
    arXiv Cite Abstract
    
    Game theoretic methods have become popular for planning and prediction in
    situations involving rich multi-agent interactions. However, these methods
    often assume the existence of a single local Nash equilibrium and are hence
    unable to handle uncertainty in the intentions of different agents. While
    maximum entropy (MaxEnt) dynamic games try to address this issue, practical
    approaches solve for MaxEnt Nash equilibria using linear-quadratic
    approximations which are restricted to unimodal responses and unsuitable for
    scenarios with multiple local Nash equilibria. By reformulating the problem
    as a POMDP, we propose MPOGames, a method for efficiently solving MaxEnt
    dynamic games that captures the interactions between local Nash equilibria.
    We show the importance of uncertainty-aware game theoretic methods via a
    two-agent merge case study. Finally, we prove the real-time capabilities of
    our approach with hardware experiments on a 1/10th scale car platform.
    
    @inproceedings{so2022mpogames,
      title={MPOGames: Efficient Multimodal Partially Observable Dynamic Games},
      author={So, Oswin and Drews, Paul and Balch, Thomas and Dimitrov, Velin
    and Rosman, Guy and Theodorou, Evangelos A.},
      booktitle={2023 IEEE International Conference on Robotics and Automation
    (ICRA)},
      year={2023},
    }
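
    The Bayesian inference over the latent mode mentioned above can be pictured
    with a generic discrete Bayes filter (an illustrative sketch assuming a
    finite mode set; the notation is mine, not the paper's):

      b_{t+1}(m) \;\propto\; p(o_{t+1} \mid m,\, b_t)\, b_t(m),
      \qquad m \in \{1, \dots, M\},

    where b_t is the belief over modes m and o_{t+1} is the latest observation.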


2022

 1. ML4PS 2022
    Data-driven discovery of non-Newtonian astronomy via learning non-Euclidean
    Hamiltonian
    Oswin So, Gongjie Li, Evangelos A Theodorou, and Molei Tao
    Machine Learning and the Physical Sciences Workshop, NeurIPS, 2022
    arXiv Cite Abstract
    
    Incorporating the Hamiltonian structure of physical dynamics into deep
    learning models provides a powerful way to improve the interpretability and
    prediction accuracy. While previous works are mostly limited to the
    Euclidean spaces, their extension to the Lie group manifold is needed when
    rotations form a key component of the dynamics, such as the higher-order
    physics beyond simple point-mass dynamics for N-body celestial interactions.
    Moreover, the multiscale nature of these processes presents a challenge to
    existing methods as a long time horizon is required. By leveraging a
    symplectic Lie-group manifold preserving integrator, we present a method for
    data-driven discovery of non-Newtonian astronomy. Preliminary results show
    the importance of both these properties in training stability and prediction
    accuracy.
    
    @inproceedings{so2022data,
      title={Data-driven discovery of non-Newtonian astronomy via learning
    non-Euclidean Hamiltonian},
      author={So, Oswin and Li, Gongjie and Theodorou, Evangelos A and Tao,
    Molei},
      booktitle={Machine Learning and the Physical Sciences Workshop NeurIPS},
      year={2022},
    }
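
    For background on the Hamiltonian structure referenced above (shown here in
    the flat, Euclidean case only; the paper's contribution is the extension to
    Lie-group manifolds via a symplectic, structure-preserving integrator), a
    learned Hamiltonian H_\theta induces dynamics through Hamilton's equations,
    and \theta is fit by matching integrated trajectories to data:

      \dot{q} = \frac{\partial H_\theta}{\partial p}, \qquad
      \dot{p} = -\frac{\partial H_\theta}{\partial q}, \qquad
      \min_\theta \sum_k \big\| x_{k+1} - \Phi^{H_\theta}_h(x_k) \big\|^2,

    where \Phi^{H_\theta}_h is a (symplectic) one-step integrator with step h.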

 2. NeurIPS 2022
    Deep Generalized Schrodinger Bridge
    Guan-Horng Liu, Tianrong Chen*, Oswin So*, and Evangelos A Theodorou
    Thirty-Sixth Conference on Neural Information Processing Systems, 2022
    arXiv Cite Abstract
    
    Mean-Field Game (MFG) serves as a crucial mathematical framework in modeling
    the collective behavior of individual agents interacting stochastically with
    a large population. In this work, we aim at solving a challenging class of
    MFGs in which the differentiability of these interacting preferences may not
    be available to the solver, and the population is urged to converge exactly
    to some desired distribution. These setups are, despite being well-motivated
    for practical purposes, complicated enough to paralyze most (deep) numerical
    solvers. Nevertheless, we show that Schrödinger Bridge - as an
    entropy-regularized optimal transport model - can be generalized to
    accepting mean-field structures, hence solving these MFGs. This is achieved
    via the application of Forward-Backward Stochastic Differential Equations
    theory, which, intriguingly, leads to a computational framework with a
    similar structure to Temporal Difference learning. As such, it opens up
    novel algorithmic connections to Deep Reinforcement Learning that we
    leverage to facilitate practical training. We show that our proposed
    objective function provides necessary and sufficient conditions to the
    mean-field problem. Our method, named Deep Generalized Schrödinger Bridge
    (DeepGSB), not only outperforms prior methods in solving classical
    population navigation MFGs, but is also capable of solving 1000-dimensional
    opinion depolarization, setting a new state-of-the-art numerical solver for
    high-dimensional MFGs. Our code will be made available at
    https://github.com/ghliu/DeepGSB.
    
    @inproceedings{liu2022deep,
      title={Deep Generalized Schrodinger Bridge},
      author={Liu, Guan-Horng and Chen*, Tianrong and So*, Oswin and Theodorou,
    Evangelos A},
      booktitle={Thirty-Sixth Conference on Neural Information Processing
    Systems},
      year={2022},
    }
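
    For reference, the classical Schrödinger Bridge problem that the abstract
    generalizes can be stated as an entropy-regularized optimal transport
    problem over path measures (standard background, not the paper's mean-field
    extension):

      \min_{\mathbb{P}} \; \mathrm{KL}\big(\mathbb{P} \,\|\, \mathbb{W}^{\varepsilon}\big)
      \quad \text{s.t.} \quad \mathbb{P}_0 = \rho_0, \;\; \mathbb{P}_1 = \rho_1,

    where W^\varepsilon is a reference (Wiener) measure with diffusion
    \varepsilon and \rho_0, \rho_1 are the prescribed boundary distributions.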

 3. Multimodal Maximum Entropy Dynamic Games
    Oswin So, Kyle Stachowicz, and Evangelos A. Theodorou
    arXiv preprint (in submission)
    arXiv Cite Abstract Video
    
    Environments with multi-agent interactions often result in a rich set of
    modalities of behavior between agents due to the inherent suboptimality of
    decision-making processes when agents settle for satisfactory decisions.
    However, existing algorithms for solving these dynamic games are strictly
    unimodal and fail to capture the intricate multimodal behaviors of the
    agents. In this paper, we propose MMELQGames (Multimodal Maximum-Entropy
    Linear Quadratic Games), a novel constrained multimodal maximum entropy
    formulation of the Differential Dynamic Programming algorithm for solving
    generalized Nash equilibria. By formulating the problem as a certain dynamic
    game with incomplete and asymmetric information where agents are uncertain
    about the cost and dynamics of the game itself, the proposed method is able
    to reason about multiple local generalized Nash equilibria, enforce
    constraints with the Augmented Lagrangian framework and also perform
    Bayesian inference on the latent mode from past observations. We assess the
    efficacy of the proposed algorithm on two illustrative examples: multi-agent
    collision avoidance and autonomous racing. In particular, we show that only
    MMELQGames is able to effectively block a rear vehicle when given a speed
    disadvantage and the rear vehicle can overtake from multiple positions.
    
    @article{so2022multimodal,
      title={Multimodal Maximum Entropy Dynamic Games},
      author={So, Oswin and Stachowicz, Kyle and Theodorou, Evangelos A.},
      journal={arXiv preprint (in submission)},
      year={2022},
    }
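
    The Augmented Lagrangian constraint handling mentioned above follows the
    standard pattern, shown here for equality constraints g(x) = 0 only as
    generic background rather than the paper's exact formulation:

      \mathcal{L}_\rho(x, \lambda) = f(x) + \lambda^\top g(x)
        + \tfrac{\rho}{2}\,\|g(x)\|^2,
      \qquad \lambda \leftarrow \lambda + \rho\, g(x),

    alternating minimization over x with updates of the multipliers \lambda.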

 4. RSS 2022
    Decentralized Safe Multi-agent Stochastic Optimal Control using Deep FBSDEs
    and ADMM
    Marcus A. Pereira, Augustinos D. Saravanos, Oswin So, and Evangelos A.
    Theodorou
    Robotics: Science and Systems, 2022
    arXiv Cite Abstract Video
    
    In this work, we propose a novel safe and scalable decentralized solution
    for multi-agent control in the presence of stochastic disturbances. Safety
    is mathematically encoded using stochastic control barrier functions and
    safe controls are computed by solving quadratic programs. Decentralization
    is achieved by augmenting each agent’s optimization variables with copy
    variables for its neighboring agents. This allows us to decouple the
    centralized multi-agent optimization problem. However, to ensure safety,
    neighboring agents must agree on what is safe for both of them, and this
    creates a need for consensus. To enable safe consensus solutions, we
    incorporate an ADMM-based approach. Specifically, we propose a Merged
    CADMM-OSQP implicit neural network layer that solves a mini-batch of both
    the local quadratic programs and the overall consensus problem as a
    single optimization problem. This layer is embedded within a Deep FBSDEs
    network architecture at every time step to facilitate end-to-end
    differentiable, safe and decentralized stochastic optimal control. The
    efficacy of the proposed approach is demonstrated on several challenging
    multi-robot tasks in simulation. By imposing requirements on safety
    specified by collision avoidance constraints, the safe operation of all
    agents is ensured during the entire training process. We also demonstrate
    superior scalability in terms of computational and memory savings as
    compared to a centralized approach.
    
    @inproceedings{pereira2022decentralized,
      title={Decentralized Safe Multi-agent Stochastic Optimal Control using
    Deep FBSDEs and ADMM},
      author={Pereira, Marcus A. and Saravanos, Augustinos D. and So, Oswin and
    Theodorou, Evangelos A.},
      booktitle={Robotics: Science and Systems},
      year={2022},
    }
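
    The consensus mechanism described above can be pictured with the standard
    scaled-form consensus ADMM iteration for min over x_i of \sum_i f_i(x_i)
    subject to x_i = z (a sketch of the background technique, not the Merged
    CADMM-OSQP layer itself):

      x_i^{k+1} = \arg\min_{x_i}\; f_i(x_i) + \tfrac{\rho}{2}\,\|x_i - z^k + u_i^k\|^2,
      \qquad
      z^{k+1} = \tfrac{1}{N}\sum_i \big(x_i^{k+1} + u_i^k\big),
      \qquad
      u_i^{k+1} = u_i^k + x_i^{k+1} - z^{k+1}.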


2021

 1. ICRA 2022
    Maximum Entropy Differential Dynamic Programming
    Oswin So, Ziyi Wang, and Evangelos A. Theodorou
    2022 International Conference on Robotics and Automation (ICRA)
    arXiv Cite Abstract Video
    
    In this paper, we present a novel maximum entropy formulation of the
    Differential Dynamic Programming algorithm and derive two variants using
    unimodal and multimodal value function parameterizations. By combining the
    maximum entropy Bellman equations with a particular approximation of the
    cost function, we are able to obtain a new formulation of Differential
    Dynamic Programming which is able to escape from local minima via
    exploration with a multimodal policy. To demonstrate the efficacy of the
    proposed algorithm, we provide experimental results using four systems on
    tasks that are represented by cost functions with multiple local minima and
    compare them against vanilla Differential Dynamic Programming. Furthermore,
    we discuss connections with previous work on the linearly solvable
    stochastic control framework and its extensions in relation to
    compositionality.
    
    @inproceedings{so2021maximum,
      title={Maximum Entropy Differential Dynamic Programming},
      author={So, Oswin and Wang, Ziyi and Theodorou, Evangelos A.},
      booktitle={2022 International Conference on Robotics and Automation
    (ICRA)},
      year={2021},
    }
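
    The maximum entropy Bellman equations referenced above take the standard
    "soft" form for cost minimization (generic background with inverse
    temperature \beta; the paper builds its DDP variants on a particular
    approximation of this recursion):

      Q(x, u) = \ell(x, u) + V\big(f(x, u)\big), \qquad
      V(x) = -\tfrac{1}{\beta} \log \int \exp\big(-\beta\, Q(x, u)\big)\, du,
      \qquad
      \pi^*(u \mid x) \;\propto\; \exp\big(-\beta\, Q(x, u)\big).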

 2. RSS 2021
    Variational Inference MPC using Tsallis Divergence
    Ziyi Wang*, Oswin So*, Jason Gibson, Bogdan Vlahov, Manan S Gandhi,
    Guan-Horng Liu, and Evangelos A Theodorou
    Robotics: Science and Systems, 2021
    arXiv Cite Abstract
    
    In this paper, we provide a generalized framework for Variational
    Inference-Stochastic Optimal Control by using the non-extensive Tsallis
    divergence. By incorporating the deformed exponential function into the
    optimality likelihood function, a novel Tsallis Variational Inference-Model
    Predictive Control algorithm is derived, which includes prior works such as
    Variational Inference-Model Predictive Control, Model Predictive Path
    Integral Control, Cross Entropy Method, and Stein Variational Inference
    Model Predictive Control as special cases. The proposed algorithm allows for
    effective control of the cost/reward transform and is characterized by
    superior performance in terms of mean and variance reduction of the
    associated cost. The aforementioned features are supported by a theoretical
    and numerical analysis on the level of risk sensitivity of the proposed
    algorithm as well as simulation experiments on 5 different robotic systems
    with 3 different policy parameterizations.
    
    @inproceedings{wang2021variational,
      title={Variational Inference MPC using Tsallis Divergence},
      author={Wang*, Ziyi and So*, Oswin and Gibson, Jason and Vlahov, Bogdan
    and Gandhi, Manan S and Liu, Guan-Horng and Theodorou, Evangelos A},
      booktitle={Robotics: Science and Systems},
      year={2021},
    }
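
    The deformed exponential mentioned above is the Tsallis q-exponential,
    which recovers the ordinary exponential as q -> 1 (standard definition; how
    it enters the optimality likelihood is specific to the paper):

      \exp_q(x) = \big[\,1 + (1 - q)\,x\,\big]_+^{\frac{1}{1-q}}, \qquad
      \lim_{q \to 1} \exp_q(x) = \exp(x).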

 3. Spatio-Temporal Differential Dynamic Programming for Control of Fields
    Ethan N Evans, Oswin So, Andrew P Kendall, Guan-Horng Liu, and Evangelos A
    Theodorou
    arXiv preprint (in submission)
    arXiv Cite Abstract
    
    We consider the optimal control problem of a general nonlinear
    spatio-temporal system described by Partial Differential Equations (PDEs).
    Theory and algorithms for control of spatio-temporal systems are of rising
    interest among the automatic control community and exhibit numerous
    challenging characteristics from a control standpoint. Recent methods focus
    on finite-dimensional optimization techniques of a discretized finite
    dimensional ODE approximation of the infinite dimensional PDE system. In
    this paper, we derive a differential dynamic programming (DDP) framework for
    distributed and boundary control of spatio-temporal systems in infinite
    dimensions that is shown to generalize both the spatio-temporal LQR
    solution, and modern finite dimensional DDP frameworks. We analyze the
    convergence behavior and provide a proof of global convergence for the
    resulting system of continuous-time forward-backward equations. We explore
    and develop numerical approaches to handle sensitivities that arise during
    implementation, and apply the resulting STDDP algorithm to a linear and
    nonlinear spatio-temporal PDE system. Our framework is derived in infinite
    dimensional Hilbert spaces, and represents a discretization-agnostic
    framework for control of nonlinear spatio-temporal PDE systems.
    
    @article{evans2021spatio,
      title={Spatio-Temporal Differential Dynamic Programming for Control of
    Fields},
      author={Evans, Ethan N and So, Oswin and Kendall, Andrew P and Liu,
    Guan-Horng and Theodorou, Evangelos A},
      journal={arXiv preprint (in submission)},
      year={2021},
    }
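
    For reference, the finite-dimensional DDP backward pass that the abstract
    generalizes builds local quadratic models of the state-action value
    function (standard equations, omitting the second-order dynamics terms; the
    paper's contribution is the infinite-dimensional, Hilbert-space
    counterpart):

      Q_x = \ell_x + f_x^\top V'_x, \qquad Q_u = \ell_u + f_u^\top V'_x,
      Q_{xx} = \ell_{xx} + f_x^\top V'_{xx} f_x, \qquad
      Q_{uu} = \ell_{uu} + f_u^\top V'_{xx} f_u, \qquad
      Q_{ux} = \ell_{ux} + f_u^\top V'_{xx} f_x,
      \qquad
      \delta u^* = -Q_{uu}^{-1}\big(Q_u + Q_{ux}\,\delta x\big).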

 4. L4DC 2021
    Adaptive Risk Sensitive Model Predictive Control with Stochastic Search
    Ziyi Wang, Oswin So, Keuntaek Lee, and Evangelos A. Theodorou
    Learning for Dynamics & Control Conference, 2021
    arXiv Cite Abstract
    
    We present a general framework for optimizing the Conditional Value-at-Risk
    for dynamical systems using stochastic search. The framework is capable of
    handling the uncertainty from the initial condition, stochastic dynamics,
    and uncertain parameters in the model. The algorithm is compared against a
    risk-sensitive distributional reinforcement learning framework and
    demonstrates outperformance on a pendulum and cartpole with stochastic
    dynamics. We also showcase the applicability of the framework to robotics as
    an adaptive risk-sensitive controller by optimizing with respect to the
    fully nonlinear belief provided by a particle filter on a pendulum,
    cartpole, and quadcopter in simulation.
    
    @inproceedings{wang2021adaptive,
      title={Adaptive Risk Sensitive Model Predictive Control with Stochastic
    Search},
      author={Wang, Ziyi and So, Oswin and Lee, Keuntaek and Theodorou,
    Evangelos A.},
      booktitle={Learning for Dynamics & Control Conference},
      year={2021},
    }
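
    The Conditional Value-at-Risk objective above has the standard
    Rockafellar-Uryasev form, CVaR_alpha(Z) = min_z { z + E[(Z - z)_+] / (1 - alpha) },
    and a sample-based plug-in estimate is a few lines of code. Below is a
    minimal sketch (my own illustration in Python/NumPy over a batch of sampled
    rollout costs; it is not the paper's stochastic-search implementation):

      import numpy as np

      def cvar(costs: np.ndarray, alpha: float) -> float:
          """Plug-in estimate of CVaR_alpha: z + E[(Z - z)_+] / (1 - alpha).

          The minimizer z* is the alpha-quantile (Value-at-Risk), so we plug in
          the sample quantile and average the excess cost in the tail.
          """
          var = np.quantile(costs, alpha)                    # Value-at-Risk at level alpha
          excess = np.maximum(costs - var, 0.0)              # cost beyond the VaR threshold
          return float(var + excess.mean() / (1.0 - alpha))  # expected tail cost

      # Example: CVaR of sampled rollout costs at alpha = 0.9 (roughly the mean
      # of the worst 10% of outcomes).
      rng = np.random.default_rng(0)
      sampled_costs = rng.normal(loc=1.0, scale=0.5, size=10_000)
      print(cvar(sampled_costs, alpha=0.9))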

© Copyright 2023 Oswin So. Last updated: September 30, 2023.