2021-08-05
Note taking toolkit
Since I like to work on the (Linux) command line as much as possible, I developed a system for organizing helpful reminders and notes to myself. I originally developed this system in grad school as a means of maintaining a digital lab notebook. My note taking toolkit is a pretty simple system of a few scripts and bash aliases, and yet it has proven to be some of the most useful code I've written to date. (The code isn't very cool, so don't get too excited. But it is incredibly valuable by virtue of the fact that well-organized notes are inherently valuable.)
This page describes:
Philosophy of the organizational convention
Script to generate note files
Alias to edit generated note file
Script to add text-based embellishments to note files
Script to compile many individual note files into one intelligently organized "notebook" file
The links to download the scripts will take you to my GitHub page; code for the aliases is simply printed on this page, under the assumption that you know how to copy/paste it into your .bashrc.
1. Philosophy of the organizational convention
When I first started computational research, I maintained a Word document as my electronic "lab notebook." Obviously, this isn't a perfect substitute for a real lab notebook (which should be physical, non-electronic, written in pen, signed and dated, etc.), but it has several advantages, such as the ability to paste in screenshots, valuable links, helpful text, etc. For those reasons, a regular ol' Word document will still be helpful.
As I progressed further in my grad school research though, I increasingly discovered features that could make my "lab notebook" more useful. Some features I want in my notes are:
Quick and easy to make a note. Oftentimes, a valuable note may be as simple as a single sentence that describes the data in a certain file or why a certain directory was renamed. I shouldn't have to fire up Word just to document such trivia.
Time-stamped. The date and time of a note is always a valuable piece of information. Since computers know the date and time, I shouldn't have to type that information manually.
Notes located where I need them. A huge drawback of the single Word document style of note keeping is that, at best, you have pasted or typed in links and paths to connect your notes to the corresponding file or datum. But the process for retrieving information about a data file can become quite convoluted when using the single document style. Do you find the file first and then search the Word document and hope you recorded some relevant note about it, or search the Word document for relevant notes and hope you recorded the path to the corresponding file? I learned that it was exceedingly beneficial to place the notes in the directories alongside the files they correspond to. This way, you will be able to immediately see any and all notes about the data and files at your current location in your file system.
Notes are useful to others. This is especially important if you're working in a group in which others share access to your file system. In an ideal world, any of your colleagues should be able to understand your files and organizational structure without undue mental anguish. (A little mental anguish will always be required.) Like the previous bullet point, creating notes in-place also means that colleagues won't have to search through your lab notebook in order to find relevant entries to help them understand what they're looking at wherever they are in your file system.
Easily organized. Sometimes a single document is still needed. Even though my convention will produce numerous individual notes, each stored next to the files it refers to, it should still be relatively easy to compile these in some rational order into a single document, in case anyone happens to request a full lab "notebook." Similarly, it should be easy to see at a glance how many notes have been created, and where.
2. Script to generate note files
In order to fulfill the requirements outlined in the previous section, I needed a script that could generate note files with consistent and sequential names, allowing the computer to automatically time/date-stamp them. All the note files will be named note_YY-MM-DDL.txt, where YY is a two-digit year designation, MM a two-digit (zero padded) month designation, and DD a two-digit (zero padded) day designation. The L stands for a single lower case letter, a-z. Additional date- and time-stamping occurs inside the file that is generated.
These filenames have the advantage that, no matter how many are in the same directory, they will be listed (ls) and printed (cat) in chronological order. The obvious drawback is that you may have an identically-named note file located at ../ for example. But in my experience, the pros of this convention far outweigh the cons, and I have never had a problem with notes at different locations having the same filename.
The script that accomplishes this is called newnote.sh and is executed simply like this:
./newnote.sh
For example, the first execution in the present working directory produces a file called note_21-08-05a.txt, which simply looks like this:
08/05/2021
7:13:22 PM
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
################################################################################
################################################################################
Apart from the date and time, the other auto-generated elements are simply convenient visual aids. For example, I often use the sets of thin lines
---------------------
---------------------
to set apart useful lines of code that I might want to copy/paste into my notes. (E.g. a specific set of arguments to a more complicated script.)
The utility of the pair of octothorpe lines at the bottom of the file becomes apparent when you view multiple note files at once. For example, executing again in the same directory:
./newnote.sh
produces a second note file, this time named note_21-08-05b.txt. Simply printing these two files with one command:
cat note_21-08-05a.txt note_21-08-05b.txt
produces this output:
08/05/2021
7:13:22 PM
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
################################################################################
################################################################################
08/05/2021
7:37:20 PM
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
################################################################################
################################################################################
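The naming and stamping convention described in this section can be sketched in a few lines of bash. This is only an illustration, not the actual newnote.sh: the function names new_note and rule are hypothetical, the layout is copied from the example output above, and the %-I time format assumes GNU date.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the real newnote.sh. new_note and rule are hypothetical names.

# rule CHAR WIDTH: print WIDTH copies of CHAR followed by a newline.
rule() {
    local line
    printf -v line '%*s' "$2" ''      # WIDTH spaces
    printf '%s\n' "${line// /$1}"     # swap every space for CHAR
}

# Create the next free note_YY-MM-DDL.txt in the current directory
# and print its name.
new_note() {
    local stamp letter file i
    stamp=$(date +%y-%m-%d)           # e.g. 21-08-05
    for letter in {a..z}; do
        file="note_${stamp}${letter}.txt"
        [ -e "$file" ] && continue    # letter already taken, try the next one
        {
            date +%m/%d/%Y            # e.g. 08/05/2021
            date +'%-I:%M:%S %p'      # e.g. 7:13:22 PM (%-I assumes GNU date)
            for i in 1 2 3 4 5 6; do rule '-' 21; done
            rule '#' 80
            rule '#' 80
        } > "$file"
        printf '%s\n' "$file"
        return 0
    done
    echo "all 26 letters are already used for ${stamp}" >&2
    return 1
}
```

Calling new_note twice in an empty directory on the same day should yield an "a" file and then a "b" file. The sketch refuses to create a 27th note for one day rather than inventing a new suffix.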
3. Alias to edit generated note file
Since newnote.sh generates very similarly-named files, and I like to make multiple notes in the same directory (keeping each note under one printed page) rather than typing everything into one single note file, this bash alias can be very helpful for quickly selecting the most recently modified note file to edit:
alias lastnote='last=$(ls -1rt | grep note_ | tail -n 1); emacs "$last"'
If emacs isn't your thing, just edit the alias to use your favorite text editor.
I also find that often the most recent note will be a sufficient reminder of what I was working on in the present directory, so this alias can help you quickly print only the most recent note, regardless of its filename:
alias lastnotecat='last=$(ls -1rt | grep note_ | tail -n 1); cat "$last"'
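Because the filenames already sort chronologically, an alternative is to pick the newest note by name with a glob instead of parsing ls output. The sketch below is one way to do that, assuming notes follow the note_YY-MM-DDL.txt pattern; the function name latest_note is hypothetical. Unlike the aliases above, it selects the note with the latest name, not the one most recently modified.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the toolkit): print the name of the newest
# note in the current directory, judged by filename rather than by mtime.
latest_note() {
    local notes=() f
    for f in note_??-??-???.txt; do       # glob results arrive sorted by name
        [ -e "$f" ] && notes+=("$f")      # skips the literal pattern if nothing matches
    done
    if [ "${#notes[@]}" -eq 0 ]; then
        echo "no note files in $PWD" >&2
        return 1
    fi
    printf '%s\n' "${notes[${#notes[@]}-1]}"   # last element = latest name
}
```

Typical use would be something like cat "$(latest_note)" or emacs "$(latest_note)". The non-zero return status makes it easy to avoid launching an editor when the directory has no notes.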
4. Script to add text-based embellishments to note files
Most of the time my notes just include a couple of sentences about what I'm working on at the moment and a few lines of code. For those purposes, the default blank notes shown above work fine. However, if I'm having trouble remembering what's in a given directory, I need to browse all my notes with a command like more note_*, and do a quick visual scan, often for just one or two specific pieces of information. So, in order to help set apart extra special notes, I wrote a very simple script called starnotetxt.sh. This script is executed simply by:
./starnotetxt.sh
and merely outputs the following template to STDOUT:
*****************************************************************
** **
** **
** **
** **
** **
** **
** **
*****************************************************************
I use this script by simply copy/pasting the above template output into my note_??-??-???.txt file, inserting the text I wish to embellish:
*****************************************************************
** **
** Example of "embellished" text. I format text inside these **
** sections by hand, adding and removing spaces and lines as **
** needed. "Star notes" simply draw attention to important **
** messages. **
** **
*****************************************************************
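A template printer like this takes only a few lines of bash. The following is a sketch rather than the actual starnotetxt.sh: the function name star_template is hypothetical, and the default width (65 columns) and row count (7) are guesses based on the template shown above.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the real starnotetxt.sh. star_template is a hypothetical name.
# Usage: star_template [WIDTH] [ROWS]
star_template() {
    local width=${1:-65} rows=${2:-7}     # defaults are guesses from the post
    local border inner i
    printf -v border '%*s' "$width" ''    # WIDTH spaces...
    border="${border// /*}"               # ...turned into a line of stars
    printf -v inner '%*s' "$((width - 4))" ''
    printf '%s\n' "$border"
    for ((i = 0; i < rows; i++)); do
        printf '**%s**\n' "$inner"        # blank row with star edges
    done
    printf '%s\n' "$border"
}
```

Running star_template with no arguments prints the box to STDOUT, ready to be pasted into a note and filled in by hand.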
5. Script to compile many individual note files into one intelligently organized "notebook" file
Using the above scripts and aliases, you'll be able to quickly create and review time-stamped, uniformly-named, simple .txt files of helpful reminder notes in any directory. You can browse these files with any of your favorite bash commands (e.g. grep or find), but what if you want to prepare one single "lab notebook" at the conclusion of a project? What if a colleague who doesn't have access to your whole filesystem would like a complete list of your project notes? Maybe you're trying to hand all your project notes to a new group member to carry on your work?
For this task, I've created the most complex script in the toolkit: compile_notebook.sh. This script creates a single text file by concatenating all the note_*.txt files found in the directory where it is executed and in every directory below it. It currently has three different options for the order in which the individual files are concatenated:
Semi-chronological order. This option identifies all the notes throughout all directories, and then concatenates the files into the output in the order in which the files were created, regardless of their directory depth. (The ordering may not be 100% chronological, since there can be a discrepancy between a given file's creation and last modification times.)
Directory depth order. This option prints all notes at one directory depth layer into the output before moving on to the next deeper directory level. Within a level/depth, files are concatenated in their alphabetical filename ordering.
"Find" order. This option runs an internal find command, and simply concatenates all notes into the output in whatever order they were identified by find.
Like many of my (non-trivial) scripts, I've added my own "manual/help" page, which is obtained by using the script's -h option:
./compile_notebook.sh -h
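To make the three orderings described above concrete, here is a minimal sketch of how each could be produced with find and sort. It is not the actual compile_notebook.sh: the function names list_notes and compile_notes and the mode names chrono, depth, and find are hypothetical, -printf assumes GNU find, the "chronological" mode uses modification time, and paths are assumed to contain no tabs or newlines.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the real compile_notebook.sh. All names here are hypothetical.

# list_notes MODE [ROOT]: print note paths, one per line, in the chosen order.
list_notes() {
    local mode=$1 root=${2:-.}
    case $mode in
        chrono)   # oldest modification time first (mtime, not creation time)
            find "$root" -type f -name 'note_*.txt' -printf '%T@\t%p\n' |
                sort -t $'\t' -k1,1n | cut -f2-
            ;;
        depth)    # shallowest directories first, then by filename, then by path
            find "$root" -type f -name 'note_*.txt' -printf '%d\t%f\t%p\n' |
                sort -t $'\t' -k1,1n -k2,2 -k3,3 | cut -f3-
            ;;
        find)     # whatever order find happens to report
            find "$root" -type f -name 'note_*.txt'
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

# compile_notes MODE [ROOT]: concatenate the notes to STDOUT in that order.
compile_notes() {
    local path
    list_notes "$@" | while IFS= read -r path; do
        printf '>>> %s\n' "$path"   # header naming the source file (my addition)
        cat "$path"
    done
}
```

A call such as compile_notes depth > ../notebook.txt would write the compiled file outside the tree being searched, which keeps it from matching note_*.txt on a later run.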
A word of caution
I make no claims as to the efficiency of this script, but it gets the job done and has worked wonders for me. In my experience, this script can take a few hours if you have on the order of 10,000 or more directories. I believe the speed bottleneck is the time required for bash to find all the note files. This makes some sense: if you have a very large number of directories, it is going to take bash a long time to traverse your file system to locate all your notes. But when I'm trying to compile notes from a research project I've worked on for months or years, I'm willing to wait a few hours to obtain an automatically generated, well-organized single notebook.