Stack-based virtual machines (VMs) are widely employed in embedded systems for their space-efficient instruction format. The bytecode can be compressed to reduce the memory footprint further. A technique introduced by Latendresse et al. (Sci Comput Program 57:295–317, 2005) decodes Huffman-compressed bytecode without first decompressing the program as a whole. Instead, the VM's instruction dispatch determines the next opcode through a sequence of table lookups on the compressed bytecode stream. In this paper we identify indirect branches as a major performance bottleneck of the Latendresse Huffman decoder. We show conclusively that the heuristics of CPU branch predictors are ineffective with Just-in-Time (JIT) Huffman decoding, and we provide a revised decoder in which indirect branches have been eliminated. We experimentally evaluate our proposed method both as a stand-alone decoder and as part of the instruction dispatch of TinyVM (Hong et al. in Softw Pract Exp 42:1193–1209, 2012). A representative selection of benchmarks from the MiBench suite (Guthaus et al. in IEEE International Workshop on Workload Characterization, WWC-4, pp. 3–14, 2001) showed improvements between 20% and 35% in overall interpreter performance.