VideoSoftware: Optimisations: Promoted all s16s to s32s in TEV

Modern x86 processors are really bad at dealing with 16 bit types. On a Core 2 CPU, any cache line containing a length changing prefix will have a decode penalty of 6 cycles. On Sandy Bridge or later, each length changing prefix will add a 3 cycle penalty to decoding (which will be better on average).
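A minimal sketch of the kind of promotion described above, not the actual TEV code: the `BlendS16` and `BlendS32` helpers and the blend formula are hypothetical and only illustrate the type change. The assumption is that keeping values in 32 bit variables lets the compiler avoid 16 bit instructions with immediates, which are the ones that need a length changing operand-size prefix on x86.

```cpp
#include <cstdint>

// Fixed-width aliases in the style used by the codebase (declared here so the
// sketch is self-contained).
using s16 = std::int16_t;
using s32 = std::int32_t;

// Before (hypothetical): 16 bit storage. Every result has to be narrowed back
// to 16 bits, and 16 bit loads, stores and compares with immediates may be
// encoded with a length changing prefix.
inline s16 BlendS16(s16 a, s16 b, s16 c)
{
  return static_cast<s16>(a + (((b - a) * c) >> 8));
}

// After (hypothetical): the same arithmetic on 32 bit values. The inputs still
// fit in 16 bits, so the results are identical, but no narrowing is needed.
inline s32 BlendS32(s32 a, s32 b, s32 c)
{
  return a + (((b - a) * c) >> 8);
}
```

For inputs within the 16 bit range both helpers return the same value, so a change like this should only affect the generated code, not the rendered output.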