Hierarchical transformers are more efficient language models
by Manawasalwa • November 04, 2021
Tags: Hacker News