In active transport molecular communication (ATMC), information particles are actively transported from a transmitter to a receiver using special proteins. Prior work has demonstrated that ATMC can be an attractive and viable solution for on-chip applications. The energy consumption of an ATMC system plays a central role in its design and engineering. This work presents an energy model for ATMC and uses it to provide guidelines for designing energy-efficient systems. The channel capacity per unit energy is analyzed and maximized. It is shown that, for a given symbol set size and symbol duration, there is a vesicle size that maximizes the rate per unit energy. It is also demonstrated that maximizing the rate per unit energy yields very different system parameters than maximizing the rate alone.