Ravi D'Elia

he who is Ravi, mindgiver of Max in glory


On Shell Infrastructure

On paper, I have been using Linux for several years, ever since I built my first PC in 2016 and didn't feel like paying for a Windows key. In practice, my on-again, off-again journey toward understanding Linux only started bearing fruit in April (of 2020, if I haven't added in automatic dating in the future). One day I just sat down and reinstalled Manjaro, which I had broken in a way that I could now likely fix, and that was that. I was once again a Linux user.

However, I had installed and reinstalled Linux in various forms many times since 2016. Why did this time stick? Well, I was stuck in quarantine for one, and needed a hobby. For another, I was high off my success getting a Minecraft server up and running on a Raspberry Pi I had lying around. But more than anything, what made me want to understand the world I had once again dived into was The Missing Semester of Your CS Education, an online course that dives into the how of using a Unix system to its full potential.

What truly drew me in was the section on data wrangling, a worked example in using Unix pipes and standard tools to manipulate data into usable form. I loved the way having all data conveyed as plaintext allowed a carefree daisy-chaining of commands to accomplish any goal. Combined with the rest of the practical base-level knowledge I had been missing, that let my latest Linux attempt go much further than my previous ones. I completely fell in love, and haven't looked back since.
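To give a flavor of that daisy-chaining, here is a minimal sketch. It is not taken from the course. The log lines and the `top_user` variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical log lines; in practice these would come from a file or another command.
log='alice login
bob login
alice logout
alice login'

# Take the first field of each line, count how often each value occurs,
# sort by count (highest first), and keep only the name from the top row.
top_user=$(printf '%s\n' "$log" \
    | awk '{print $1}' \
    | sort \
    | uniq -c \
    | sort -rn \
    | head -n 1 \
    | awk '{print $2}')

echo "$top_user"   # prints: alice
```

Each stage only reads text and writes text, which is why the stages can be swapped or extended without touching the others.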

The Honeymoon Ends

Sadly, nothing lasts forever. Though I love and loved hacking together bash scripts, I had increasing issues with some aspects of how bash, and POSIX shells in general, work. The absolute biggest issue is the agonizing complexity of expansion. I have read the docs far too many times, and I still regularly make simple mistakes with expansion. Even now I have to use hacky workarounds for what should be really simple passthroughs.
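One small illustration of the kind of expansion mistake that is easy to make is word splitting on an unquoted variable. This is only a sketch. The `count_args` helper and the filename are invented for the example.

```shell
#!/usr/bin/env bash
# count_args prints how many arguments it received.
count_args() {
    echo "$#"
}

file="my notes.txt"   # hypothetical filename containing a space

unquoted=$(count_args $file)     # unquoted: the shell splits the value on the space
quoted=$(count_args "$file")     # quoted: the value stays a single argument

echo "unquoted: $unquoted, quoted: $quoted"   # prints: unquoted: 2, quoted: 1
```

The two calls differ only by a pair of double quotes, which is exactly why this class of bug is so easy to write and so hard to spot.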

Making this worse is the textual nature of the input and output. On one hand, this is absolutely what makes the magic happen. On the other, passing data around in the same format as the code doing the passing inevitably leads to layers on layers of complexity to no benefit. The struggles I went through writing a script that takes a command as input were bad enough; when that input command was itself a script to open a command in a terminal, it became nearly impossible. With subshells on subshells and multistage expansion, POSIX-compliant shell can get unmanageable quickly.
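For the simplest version of this problem, a wrapper that takes a command as its arguments, forwarding `"$@"` keeps every argument boundary intact, even when wrappers are nested. The sketch below assumes the command arrives as separate arguments rather than as one string, and `run_with_banner` is a hypothetical name. It does not cover the harder case where the command has to cross into a new terminal as text.

```shell
#!/usr/bin/env bash
# run_with_banner is a hypothetical wrapper: it announces the command on stderr,
# then runs it with every argument forwarded unchanged.
run_with_banner() {
    echo "running: $1" >&2
    "$@"
}

# Nesting still works, because each layer forwards "$@" without re-parsing it.
out=$(run_with_banner run_with_banner printf '[%s]' "two words" "one")

echo "$out"   # prints: [two words][one]
```

Once the command has to be flattened into a single string, for example to hand it to another program's command-line option, this guarantee is lost and the quoting layers start to pile up.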

Much of this complexity could be avoided, however, by a stronger type system. In my ideal world, type information would simply be a fourth standard stream, along with stdin, stdout, and stderr. In the worst case, it too could be encoded in text. Either way, being able to assign types to variables would enormously simplify bash scripts. Particularly for newbies like me, a strongly typed alternative to standard bash would be super helpful.
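For comparison, the closest thing bash itself offers is the set of variable attributes on `declare`. The sketch below shows the integer and array attributes. This is far from the typed stream described above, since the attributes live only inside one shell process and nothing about them travels through a pipe. The variable names are made up for illustration.

```shell
#!/usr/bin/env bash
declare -i count=5     # integer attribute: assignments are evaluated arithmetically
count+=3               # arithmetic addition, so count is now 8

plain=5                # no attribute: the value is just a string
plain+=3               # string concatenation, so plain is now "53"

# Indexed array: each element keeps its own boundary, spaces included.
declare -a files=("a b.txt" "c.txt")

echo "$count $plain ${#files[@]}"   # prints: 8 53 2
```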

Possible fixes

On reading that, you probably had a lot of ideas on what the better option would be. Let's run through some:

Some projects are trying to build something like this. In fact, reading through old hacker news posts is what brought this subject to mind:

These are good ideas, and frankly I think they're doing God's work. However, they are still in active development. In time I hope we see something like this take off, but unless a big distro puts some time into making it work we'll never see widespread adoption. In the meantime, though, many of these problems can be solved with a little bash infrastructure.

Shell Infrastructure: We're finally talking about it

That's right, we're finally at the topic of this whole post. Sue me, I'm new to this. I firmly believe that many of these issues can be almost completely mitigated with tooling that uses the bash that's already in place. As much as I've complained about it, bash wasn't a game changer for me for nothing. These tools are the spiritual successors to text manipulation tools like awk and sed and handy scripting assistants like xargs and read, in that they don't perform a final action. Instead, they serve as infrastructure and utilities, allowing for more adaptable and idiomatic scripts. These tools are the things that make the Unix philosophy work, and in the modern world there is no reason that new tools can't keep following it.

Like I said at the beginning, I'm still a bash beginner, so I've probably missed a world of helpful tools. I do still think that this is a somewhat underserved field, and encourage everyone to contribute to it. I plan on writing some helpful tools of my own, but these are what I've found:

These tools help translate normal Unix shell into something a little more manipulatable. On the other side of the coin, xargs manages to cover most use cases. Though this isn't quite enough to completely eliminate the need for a fresher, more modern shell, tools like these make the process a whole lot simpler. The main issue, of course, remains the nightmare that is expansion. Luckily, a great deal of those issues can be resolved by having data in more modern, manipulatable formats, since chaining together combinations of stdin and arguments isn't as necessary. That isn't a complete solution on its own, though. If you know of one, let me know.
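As a small example of xargs bridging stdin and arguments, the sketch below compares whitespace-delimited input with NUL-delimited input. It assumes an xargs that supports the `-0` option, which the GNU and BSD versions do. The items and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Default mode: xargs splits its input on blanks and newlines,
# so "two words" turns into two separate arguments.
naive=$(printf '%s\n' "two words" "one" | xargs printf '[%s]')

# With -0, items are separated by NUL bytes instead,
# so each item arrives as exactly one argument.
safe=$(printf '%s\0' "two words" "one" | xargs -0 printf '[%s]')

echo "naive: $naive"   # prints: naive: [two][words][one]
echo "safe:  $safe"    # prints: safe:  [two words][one]
```

The NUL-delimited form sidesteps the splitting problem without any quoting at all, because the separator can never appear inside an item.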