Is it mainly a matter of too many layers?
Or are the algorithms not flexible enough to make the best use of the hardware?
Or both?
...