We present a deep reinforcement learning approach to optimizing the performance of computation graphs in a static compiler. The key idea is to combine a neural network policy with a genetic algorithm, the Biased Random-Key Genetic Algorithm (BRKGA). The policy is trained to predict, given an input graph to be optimized, the node-level probability distributions used for sampling mutations and crossovers in BRKGA. Our approach, ``REINFORCE-based Genetic Algorithm Learning'' (REGAL), exploits the policy's ability to transfer to new graphs to significantly improve the solution quality of the genetic algorithm for the same objective-evaluation budget. As a concrete application, we show results for minimizing peak memory in TensorFlow graphs by jointly optimizing placement and scheduling. REGAL achieves on average 3.56\% lower peak memory than BRKGA on previously unseen graphs, outperforming all the algorithms we compare to, and giving a $4.4\times$ bigger improvement than the next best algorithm. It captures on average 54.0\% of the estimated available headroom for improvement over BRKGA with minimal overhead. Our approach and analysis are made possible by collecting a dataset of 372 unique real-world TensorFlow graphs, more than an order of magnitude more data than previous work.
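The mechanism the abstract describes, a policy supplying per-node distributions that bias a BRKGA's crossover and mutant sampling, can be sketched as follows. This is an illustrative toy, not the paper's implementation: the `policy_probs` vector stands in for the trained policy's node-level outputs, the Gaussian mutant sampling and all hyperparameter values are assumptions made for the sketch.

```python
import random

def brkga(fitness, n_genes, policy_probs, pop_size=20, n_elite=4,
          n_mutants=4, generations=50, seed=0):
    """Minimal biased random-key GA sketch. Each solution is a vector of
    keys in [0, 1]; `policy_probs` plays the role of the learned policy's
    per-node distributions, biasing both mutant sampling and crossover."""
    rng = random.Random(seed)

    def mutant():
        # Sample each key around the policy's per-gene mean
        # (Gaussian with a fixed spread is an assumption of this sketch).
        return [min(1.0, max(0.0, rng.gauss(p, 0.2))) for p in policy_probs]

    # Initialize and keep the population sorted best-first by fitness.
    pop = sorted((mutant() for _ in range(pop_size)), key=fitness)
    for _ in range(generations):
        elite = pop[:n_elite]
        children = []
        while len(children) < pop_size - n_elite - n_mutants:
            e = rng.choice(elite)          # elite parent
            o = rng.choice(pop[n_elite:])  # non-elite parent
            # Biased uniform crossover: inherit each gene from the elite
            # parent with per-gene probability given by the policy.
            children.append([e[i] if rng.random() < policy_probs[i] else o[i]
                             for i in range(n_genes)])
        # Elites survive unchanged; fresh mutants maintain diversity.
        pop = sorted(elite + children + [mutant() for _ in range(n_mutants)],
                     key=fitness)
    return pop[0]
```

In REGAL the role of `policy_probs` is filled per input graph by a graph neural network, so a good prior over mutations and crossovers transfers to unseen graphs without extra objective evaluations.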