Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a promising path towards sustainable energy. A core challenge is to shape and maintain a high-temperature plasma within the tokamak vessel. This requires high-dimensional, high-frequency, closed-loop control using magnetic actuator coils, further complicated by the diverse requirements across a wide range of plasma configurations. In this work, we introduce a novel architecture for tokamak magnetic controller design that autonomously learns to command the full set of control coils. This architecture meets control objectives specified at a high level, while satisfying physical and operational constraints. The approach offers unprecedented flexibility and generality in problem specification and significantly reduces the design effort needed to produce new plasma configurations. We successfully produce and control a diverse set of plasma configurations on the Tokamak à Configuration Variable (TCV) [1, 2], including elongated, conventional shapes, as well as advanced configurations, such as negative triangularity and “snowflake” configurations. Our approach achieves accurate tracking of the location, current, and shape for these configurations. We additionally demonstrate sustained “droplets” on TCV, where two separate plasmas are maintained simultaneously within the vessel. This work represents a significant advance for tokamak feedback control and shows the potential of reinforcement learning to accelerate research in the fusion domain; the tokamak is one of the most challenging real-world systems to which reinforcement learning has been applied.