Day 1 -- Fluency in Use of the Command Line
EVERY day is about the pre-requisites for the things that we will do on following days … at first, we just work on something that is extremely mundane, i.e., our fluency in the use of the command line.
We will spend an enormous amount of time working on other systems in the cloud … those systems will almost always involve working inside the shell of a development container provisioned with some sort of Linux machine image OR, at some point, we ourselves might want to provision our own development containers for others … even though some Unix purists might playfully chide us, saying that the more participatory, messy philosophy of Linux is not exactly the slightly elitist, definitely more stable philosophy of Unix they prefer. Linux is not the original UNIX; but it is very UNIX-like and in some ways a better, more actively evolving development system … perhaps what we should think of as modern Unix.
We will definitely need to understand some key features of the UNIX philosophy, or what an ideal Unix architecture is supposed to be, before we re-invent any old wheels:
- Unix systems are extremely durable, using a centralized operating system kernel, written primarily in the C language, which manages system and process activities.
- All non-kernel software is organized into separate, kernel-managed processes, lightweight processes and threads. Understanding this is key for understanding development containers and container orchestration systems … but also necessary for actually understanding WebAssembly.
- Unix systems are preemptively multitasking: multiple processes or threads can and do run at the same time (or within small time slices, nearly at the same time), and any process can be interrupted and moved out of execution by the kernel. Optimizing this thread management is essential for optimizing the performance of computation – it hardly matters when we have 100X more compute power than we need, but it is a big deal when compute power limits what we can process.
- Files are stored on disk in a hierarchical file system, with a single top location throughout the system (root, or “/”), with both files and directories, subdirectories, sub-subdirectories, and so on below it. Finding our way around in this file structure is an essential prerequisite to doing anything.
- With few exceptions, devices and some types of communication between processes are managed and visible as files or pseudo-files within the file system hierarchy. This is known as UNIX’s “everything is a file” philosophy. However, Linus Torvalds is absolutely right in correcting this inaccuracy and pointing out why it’s best to think of “everything as a STREAM of bytes” rather than a file … so, in order to understand where we are going, it is essential to understand controlling those streams with pipelines (a tiny illustration follows this list).
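A minimal sketch of the “streams, not files” idea on a typical Linux system (the /proc and /dev paths assume Linux; exact output will vary from machine to machine):

```bash
# Kernel and process state exposed as pseudo-files you can read like any other file:
head -n 5 /proc/cpuinfo            # CPU details
head -n 5 /proc/$$/status          # status of the current shell process ($$ is its PID)

# Devices that behave as byte sinks and sources:
ls -l /dev/null /dev/urandom
echo "discard me" > /dev/null      # redirect a stream of bytes into the null device
head -c 16 /dev/urandom | od -An -tx1   # pull 16 random bytes off a device stream
```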
Really UNDERSTAND pipelines and the general philosophy behind the architecture of Linux
In order to understand what the shell controls, or how and why it is controlled in a certain manner, we will find that in order to be somewhat efficient [to be able to play well with others] we will often need to combine commands. Doing that involves thinking in Linux and understanding where we are going and why fluency in those combinations accomplishes our purpose more efficiently; the small pipeline sketch below gives the flavor.
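Here is a sketch of the pipeline idea using only standard-issue tools, each doing one small job and passing a stream of bytes to the next:

```bash
# Count the bash processes currently running:
ps aux | grep '[b]ash' | wc -l     # the [b] trick keeps grep from matching itself

# Show the ten largest files under the current directory:
find . -type f -exec du -h {} + 2>/dev/null | sort -rh | head -n 10
```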
Practically speaking, we should try to be more familiar with the bash shell and the more involved bash idioms … in order to understand the motivations for modern commands that exploit the pipeline. Backing up a step, that familiarity also has to include being aware of the old, trusted GNU Core Utilities and the reliable, standard-issue [generally available in all UNIX distros] common Unix tools. The modern tools may indeed be a lot better [although sometimes they are not] … but those tools will not always be provisioned in new container images.
Quite often, better is about speed of execution … often from applications written in Rust, or reimagined and refactored in C, which opens up the ability to bring in features that offer some semblance of intelligence, e.g., perhaps a neural-network-backed search … and that drives much of the motivation behind the modern versions and improvements. Of course, much of the development activity is simply a matter of sharpening the old axe with more effective workflow design, to eliminate pitfalls or problems of the common GNU Core Utilities or standard-issue [generally available in all distros] common Unix tools. {NOTE to self: Learn RustLang, WASM and maybe try to level up the old C skillset.}
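As a concrete sketch of the old-versus-new point: the classic forms below work on essentially any image, while the modern equivalents (assuming ripgrep and fd happen to be installed) are usually faster and terser:

```bash
# Classic, always-available: recursive search and file listing with standard tools.
grep -rn --include='*.md' 'pipeline' .     # search *.md files for "pipeline"
find . -type f -name '*.md'                # list the matching files themselves

# Modern equivalents, if rg and fd are provisioned in the image:
rg -n 'pipeline' -g '*.md'                 # ripgrep: respects .gitignore, usually faster
fd -e md                                   # fd: shorter syntax for the common case
```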
Fluency in Use of the Command Line
What did Abe Lincoln say about the first thing to do when cutting down lots of trees? Well, there’s simply no end to the amount of “sharpening the axe” that is NECESSARY for your personal sysadmin tools in information technology … get used to the fact that you simply can never know enough in this realm. It is now old news that software has eaten the world; you should also understand why data has replaced capital and why deep learning, as a hybrid machine + human skill, has become the dominant force in [self-]education.
Automating your intelligence gathering with personal sysadmin tools is fundamentally about time management … using machines effectively to give yourself enough time to relax and think … optimizing every second of every day.
This means that fluency on the command line has become an absolutely essential prerequisite skill for all kinds of learning and communication nowadays. If you do ANY work in data wrangling OR the artificial intelligence of knowledge engineering OR physics-informed neural networks, you will find that you simply need to be fluent in basic Unix commands in order to “make it to first base” or get into the game … without this command-line fluency and the ability to work in the world of containers, you will essentially be relegated to being only a consumer of pre-digested information, or of the kind of pre-analyzed data which others feel like passing along [for their reasons]. Without this fluency, you are dependent – you will be spoon-fed regurgitated information which has been dished up for the masses, who are prized for their accounts, insurance coverage and luxuriant pecuniary pelts.
You will probably want to build upon, rather than replace, fluency in the earlier, standard-issue common Unix commands, and that means developing your own repository of notes on mastering the command line and possibly contributing to the [main originating repository](https://github.com/jlevy/the-art-of-command-line). The point of developing your own selection of notes and tips on using the command line is to be more effectively in control of your own tools by continually working, with intention, to “SHARPEN the axe”.
The best way to develop fluency is to “swing the axe” and get to work learning the HARD way … there’s no good substitute for actually USING the commands and doing more than simply copying and pasting commands that you come across while browsing the StackOverflow, ServerFault, DevOps StackExchange or Unix/Linux StackExchange forums … of course, you really do need to KNOW what those commands are doing, and YOU KNOW that will not be true the first time you use a command, so … work with intention to understand what you are doing … read and learn as much as you can beforehand and rely on things like ExplainShell to get a helpful breakdown of what commands, options, pipelines, etc. might do (the annotated example below is the sort of breakdown you should be able to reconstruct yourself) – but mostly, you have to develop fluency in the idiom to understand where you are going.
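A small sketch of that kind of breakdown, taking one pipeline apart the way ExplainShell would (du, sort and head here are the GNU versions found on most Linux images):

```bash
du -ah /var/log 2>/dev/null | sort -rh | head -n 5
#  du            estimate file space usage
#  -a            include files, not just directories
#  -h            human-readable sizes (K, M, G)
#  /var/log      the directory tree to walk
#  2>/dev/null   throw away permission-denied noise on stderr
#  | sort -rh    sort by human-readable size, largest first
#  | head -n 5   keep only the top five lines
```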
The modern tools will include things like the following (a couple of quick usage sketches follow the list):
- broot … one fast application that illustrates the beauty of Rust development and how open-source dev communities work nowadays. It is a far better way to navigate directories, although the scope of how broot changes the game might seem intimidating at first, and not just because of how it replaces ls (and its clones) … you can get an overview of a directory, even a big one … when you find a directory, you can cd right into it … you never lose track of the file hierarchy while you search … you can manipulate your files or manage files with panels … preview files, apply a standard or personal command to a file, apply commands to several files, sort, see what takes space and check git statuses … you should spend some time looking at the full array of reasons WHY you would want to master broot
- bat … a cat (concatenate) clone with syntax highlighting and Git integration
- exa … a modern replacement for ls … exa could be developed under a different set of assumptions than existed in the 1970s: a) the computer is no longer the bottleneck, b) 256-colour terminals abound, c) the old ls is still there if you need it
- lsd … ls deluxe … a rewrite of GNU ls with lots of added features like colors, icons, tree view, more formatting options, etc. The project is heavily inspired by the super colorls Ruby script project.
- delta … syntax-highlighting pager for Git, diff, and grep output
- dust … du + rust = dust … a more intuitive du-style disk-usage utility … use deb-get to install dust.
- deb-get … an alternative to apt-get for software not (yet) officially packaged for Debian/Ubuntu – maybe fast-moving software where newer versions are available from the vendor/project, OR maybe non-free software that Debian/Ubuntu cannot distribute due to licensing restrictions.
- duf … user-friendly disk usage utility with colorful output that adjusts to your terminal’s theme and width, sorts results according to your needs, lets you group and filter devices, and can conveniently output results in JSON format.
- fd … a simple, fast and user-friendly alternative to find, written in Rust
- ripgrep … an extremely fast alternative to plain grep which respects your .gitignore … of course, there’s a VS Code extension which utilizes ripgrep and fzf
- ag … a code-searching tool similar to ack, which was itself designed to go beyond plain grep and be better for programmers with large heterogeneous trees of source code; ack is written in portable Perl 5 and takes advantage of the power of Perl’s regular expressions … since ag is implemented in C and uses pthreads, it is an order of magnitude faster in some cases and getting even faster.
- fzf … a general-purpose command-line fuzzy finder which can be used with any list: files, command history, processes, hostnames, bookmarks, git commits, etc. … the VS Code extension mentioned above utilizes ripgrep together with fzf
- mcfly … replaces your default ctrl-r shell history search with an intelligent search engine that takes into account your working directory and the context of recently executed commands. The key point of interest in this Rust application is smart command prioritization powered by a small neural network that runs fast in real time.
- choose … a human-friendly and fast (i.e., written in Rust) alternative to cut, and (sometimes) awk, for extracting sections from each line of input.
- jq … an application written in C that does for JSON data what sed does for text: it lets you slice / filter / map / transform structured data with the same ease as sed, [awk](https://en.wikipedia.org/wiki/AWK), grep and the other text-processing utilities that are part of the set of common, standard-issue Unix commands. A jq program works as a “filter” on its input to produce an output – there are a lot of built-in filters or filter components in jq for extracting a particular field of an object, converting a number to a string, and various other standard tasks (see the short jq sketch after this list).
- jid … a JSON incremental digger written in GoLang, with suggestions and auto-completion, for simply and comfortably drilling down through JSON interactively using filtering queries like those of jq
- jiq … jid on jq … a simple interactive JSON query tool which invokes jq so you can use jq expressions
- sd … another application written in Rust; an intuitive find-and-replace command-line interface which uses regex syntax and focuses on doing one thing and doing it well.
- cheat … allows for creation and viewing of interactive cheatsheets on the command-line. It was designed to help remind *nix system administrators of options for commands that they use frequently, but not frequently enough to remember.
- tldr … collection of community-maintained help pages for command-line tools or cheatsheets which aim to be a simpler, more approachable complement to traditional man pages
- bottom … a cross-platform graphical process/system monitor inspired by other tools like gtop, gotop and htop. With Linux via WSL or WSL2 there can be problems getting memory data, accessing temperature sensors or matching Windows’ own Task Manager in terms of data.
- gtop … a system-monitoring dashboard for the terminal command line implemented in JavaScript and node.js
- vtop … another graphical activity monitor for the terminal command line, also implemented in JavaScript and node.js but with a different dashboard interface
- gotop … yet another terminal-based graphical activity monitor, inspired by gtop and vtop, but this time written in GoLang
- htop … a cross-platform interactive terminal-based graphical process viewer that is written in C using the ncurses “cursor optimization” library of functions which manage an application’s display on character-cell terminals
- glances … a cross-platform system-monitoring tool, written in Python, that presents CPU, memory, disk and network information in a single terminal view
- hyperfine … a command-line benchmarking tool, written in Rust, for timing and comparing commands over many runs
- gping … ping, but with a graph drawn right in the terminal
- procs … a modern replacement for ps, written in Rust, with colored, human-readable output
- httpie … a user-friendly command-line HTTP client for testing and interacting with APIs
- curlie … a frontend to curl that adds the ease of use of httpie without giving up curl’s features
- xh … a friendly and fast tool for sending HTTP requests; essentially a reimplementation of httpie in Rust
- zoxide … a smarter cd command, written in Rust, that learns which directories you actually use and lets you jump straight to them
- dog … user-friendly command-line DNS client or DNS lookup utility, like dig on steroids
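To make the list above a little more concrete, here is a quick usage sketch. None of these tools are guaranteed to be in a fresh container image, package names vary by distro (on Debian/Ubuntu, bat installs as batcat and fd as fdfind), and the file and directory names below are placeholders – treat this as a sample session rather than a recipe:

```bash
bat README.md                    # cat with syntax highlighting and a git gutter
fd -e rs -x wc -l                # find *.rs files and run wc -l on each one
rg -n 'TODO' --type py           # search Python files for TODO, with line numbers
history | fzf                    # fuzzy-pick a line from your shell history
z notes                          # zoxide: jump to a frequently used directory
                                 #   (requires eval "$(zoxide init bash)" in ~/.bashrc)
du -sh * | sort -rh | head -n 5  # the classic fallback still works everywhere
```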
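And since jq rewards a little up-front practice, here is a minimal sketch of the filter idea (the JSON here is made up on the spot):

```bash
# Pull one field out of each object in an array:
echo '[{"name":"broot","lang":"Rust"},{"name":"jq","lang":"C"}]' | jq '.[].name'
# -> "broot"
#    "jq"

# Map/transform: build new objects, keeping only the fields we care about:
echo '[{"name":"broot","lang":"Rust"},{"name":"jq","lang":"C"}]' \
  | jq '[.[] | {tool: .name, written_in: .lang}]'

# jid and jiq make the same drilling-down interactive: pipe JSON into them
# and type the filter while watching the output update live.
```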
You will NEVER actually be able to finish this pre-requisite lesson from Day 1
The meta-lesson of this lesson, about something as simple as Linux, the Linux development community and the community of open-source developers building better sysadmin tools, is the humility of learning from other human beings.
Of course, this lesson should PRECEDE any serious work from the curated list of AWESOME bash scripts and resources, because before even using a script you should really understand what you are trying to automate and where this is going to end up … except, of course, as you use scripting you will learn that your own fluency in mastering the command line was, and always will be, far less than adequate. That is because this skill is fundamentally about trying to be better at learning how the best of several billion humans learn how to learn … maybe you can almost learn to keep up with the smartest person you know, but this is really about trying to keep up with the smartest group of 1,000 or so learners from the very smartest of smart 1,000,000 people you might be able to know.