Naval engagements, though rare, pose complex challenges for data-driven machine learning because of their intricate dynamics and the scarcity of empirical data. Such conflicts are well represented within the framework of Partially Observable Stochastic Games (POSGs), which model the adversarial interactions between contending forces in terms of decision-making agents, states, actions, observations, and probabilistic state transitions. This research describes the application of Multi-Agent Reinforcement Learning (MARL) algorithms, particularly Double Deep Q-Networks (DDQN) and Proximal Policy Optimization (PPO), to the complexities inherent in strategic naval operations. Despite the operational challenges encountered, the findings underscore the effectiveness of MARL in formulating and assessing tactical strategies, contributing substantially to improved tactical planning and to the introduction of novel strategic paradigms. The investigation highlights the potential of MARL in naval strategy and decision-making, confirming its applicability to complex scenarios and its capacity to reshape traditional approaches to military strategy.
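For reference, the POSG framework mentioned above is commonly formalized as a tuple; the abstract does not state this definition explicitly, so the following is a standard textbook formulation rather than the paper's own notation:

\[
\mathcal{G} = \big\langle \mathcal{I},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{I}},\ \{\mathcal{O}_i\}_{i \in \mathcal{I}},\ T,\ O,\ \{R_i\}_{i \in \mathcal{I}} \big\rangle
\]

where \(\mathcal{I}\) is the set of agents, \(\mathcal{S}\) the set of states, \(\mathcal{A}_i\) and \(\mathcal{O}_i\) the action and observation sets of agent \(i\), \(T(s' \mid s, \mathbf{a})\) the probabilistic state-transition function under joint action \(\mathbf{a}\), \(O(\mathbf{o} \mid s', \mathbf{a})\) the joint observation function, and \(R_i\) the reward function of agent \(i\).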