There's a major problem with GUIs as they exist today: a lack of composability, meaning the ability to connect multiple programs together or assemble larger blocks of functionality from the smaller ones you already have and understand. This is something the CLI gets right, and it's one of the big reasons people tout CLIs as being more efficient.
In a composable interface, I might only ever need to learn a few searching/scanning/data extraction tools really well. Instead of learning yet another search interface, or worse yet, resorting to a program's built-in find-and-replace dialog, I could just plug my favorite tool in and use that. For a less text-oriented example, imagine taking a video lecture, filtering it through a video-processing program to extract the slides, and sending the images directly into another program that assembles them into a PDF. No saving intermediate files to disk or manually dragging things around, just wiring them together and letting the data go from one program to another. Finally, once you've got it perfect, it'd be trivial to make the whole chain into a new "tool" you could use later on - maybe right-click a YouTube link, run it through the "get video, filter, make PDF" chain, then print the output.
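To make the "chain becomes a new tool" idea concrete, here is a minimal Python sketch. Everything in it is a hypothetical stand-in: `chain`, `extract_slides`, and `label_pages` are made-up names, and the "frames" are plain strings rather than real video data. The only point is that a wired-up chain has the same shape as its parts, so it can be reused as a building block.

```python
from functools import reduce
from typing import Callable, Iterable

# A "stage" is anything that takes a stream of items and returns a stream.
Stage = Callable[[Iterable[str]], Iterable[str]]


def chain(*stages: Stage) -> Stage:
    """Wire stages together left to right; the result is itself a stage."""
    def run(items: Iterable[str]) -> Iterable[str]:
        return reduce(lambda data, stage: stage(data), stages, items)
    return run


# Hypothetical stand-in for a slide extractor: keep a frame only when it
# differs from the one before it.
def extract_slides(frames: Iterable[str]) -> Iterable[str]:
    previous = None
    for frame in frames:
        if frame != previous:
            yield frame
        previous = frame


# Hypothetical stand-in for the PDF assembler: number each slide as a page.
def label_pages(slides: Iterable[str]) -> Iterable[str]:
    for number, slide in enumerate(slides, start=1):
        yield f"page {number}: {slide}"


# The chain is packaged as a new tool that can be called (or chained) later.
lecture_to_pages = chain(extract_slides, label_pages)

frames = ["title", "title", "intro", "intro", "intro", "demo"]
print(list(lecture_to_pages(frames)))
```

Because `lecture_to_pages` is just another stage, it could be passed to `chain` again alongside other stages, which is the property most GUI applications lack.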
In most GUIs I've seen, these functions are all separate programs that can't really talk to each other aside from saving and loading files. There's certainly no easy way to package up a collection of programs, their options, and the connections as a new building block. Essentially, the problem is that applications are special - you can't connect them together to make a new application without programming and a lot of work.
CLIs are great at composability. Connecting the output of one program to the input of another is so fundamental to the Unix paradigm that there's a character dedicated to it: the pipe, |. They're by no means perfect - discoverability is not one of the strong points - but the ability to put together small, easily-understood parts like Lego to get the functionality you want outweighs the drawbacks for a lot of people, myself included.
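For anyone who hasn't used a pipe, this is roughly what the shell does when it connects two programs, sketched in Python with the standard `subprocess` module. The two "programs" here are throwaway one-liners run through the current Python interpreter (an assumption made so the example works without any particular Unix tools installed), and `pipe_two` is a hypothetical helper name.

```python
import subprocess
import sys

# Two tiny stand-in programs: one emits lines, the other sorts its input.
PRODUCE = "print('pear'); print('apple'); print('fig')"
SORT = "import sys; sys.stdout.writelines(sorted(sys.stdin))"


def pipe_two(first_code: str, second_code: str) -> str:
    """Run two programs with the first one's stdout feeding the second's stdin."""
    first = subprocess.Popen(
        [sys.executable, "-c", first_code],
        stdout=subprocess.PIPE,
    )
    second = subprocess.Popen(
        [sys.executable, "-c", second_code],
        stdin=first.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy of the pipe so the second program sees end-of-input
    # once the first one finishes.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    return output


result = pipe_two(PRODUCE, SORT)
print(result)
```

Neither program knows anything about the other; the data goes straight from one to the next without an intermediate file, which is the behavior the GUI examples above are missing.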
You could have a perfect GUI by all the standards Ross sets out in the video, but if it didn't solve this problem it'd still end up being necessarily inefficient for a large class of tasks. He mentioned in passing that he sometimes spends large amounts of time searching for and evaluating programs to accomplish a specific task; the fact that this is common enough to mention is a perfect example of what I mean. With a composable system it wouldn't go away entirely, but it'd be less frequent.
I don't have as much direct investment in improving GUIs as some people, since I'm a developer and live in the terminal or browser 99% of the time, but I figured I'd contribute my two cents anyway. Besides, if there *is* a solution to GUI composability, I wouldn't object to a nicer computing environment.