https://machinelearningmastery.com/skip-connections-in-transformer-models/