Text Files editors in the Terminal
command | function | syntax/example usage |
---|---|---|
nano |
edit file | nano FILENAME |
vim |
edit a file | vim FILENAME |
nano
– Text editor more like a GUI
Nano opens up and will feel like a typical text editor you are familiar with. Arrow keys can be used navigate the text. Below are some additional shortcuts.
NANO SHORTCUTS | functionality |
---|---|
ctrl+o |
save file |
ctrl+x |
close file |
alt+/ |
go to end of the file |
ctrl+a |
go to start of the line |
ctrl+e |
go to end of the line |
ctrl+c |
show line number |
ctrl+_ |
go to line number |
ctrl+w |
find matching word |
alt+w |
find next match |
ctrl+\ |
find and replace |
Copy and paste the following text into a file named myFirstfile.txt
The greatest challenge is sifting through all the data that is generated by the
simulation. In fact, it is impossible. Impossible because to record every
event everywhere in the artificial universe would take more hard drive space
than was physically possible to create given Earth's dwindling resources.
Therefore, my life's greatest achievement was to encode a method to filter the
data, so only the most relevant events related to what we want is recorded. In
our case, that means space travel. But not just any space travel, space travel
between galaxies.
Start by copying the text above using your mouse (or built-in button).
Then in a terminal use nano
to create a file named myFirstfile.txt
nano myFirstfile.txt
Paste your text and then hit ctr x
press y
for yes to save and hit enter
, which will return you to the prompt. This will save the file with the text in it.
VIM
vim
is another text editor that when you are more comfortable with the unix command line, it will be worth your time to learn. Many first time Unix users type in vim
to enter the editor and get stuck in the editor as it is less intuitive than nano. I will not go into great detail in this introductory tutorial but want to provide you with resources to explore on your own. There is a command line like feature in this editor that a user can use to execute very powerful text editing functions.
Vim | Very Basics Usage |
---|---|
esc |
Hitting escape will take you back to the original state when you opened vim. |
i |
insert mode: In this mode it is possible to add/edit the text |
esc :wq |
hitting escape then typing : w q will write(save) the file and quit |
Copying and pasting work like other editors as long as you are in the insert mode.
Syntax highlighting for Biology specific files
If you’re a coder, then you already know how useful the syntax highlighting is for your programming language. However, if you’re a biologist and work with lots of biology specific files (fasta, msa, vcf etc) on command-line, then you might have noticed how non-intuitive it feels for manually inspecting them. This tutorial aims to reduce that feeling a little bit and makes working on CLI a bit better!
If you want a hands-on approach with support to custom files, first section covers nano
text editor syntax coloring. Since for vim
has complicated syntax file formats, we will use a readymade solution that uses the program called BioSyntax
. BioSyntax comes with present syntax coloring for pretty much all types of files in Biology/Bioinformatics and works with various other commands as well (like less
, vim
etc.).
Nano as the text editor (custom syntax coloring)
For nano
most settings can be set in .nanorc
file located in your home directory (/home/username/.nanorc
or simply ~/.nanorc
).
You can add file specific syntax coloring by editing .nanorc
file. For example, for nucleotide file (file extensions .fna
.fasta
or .fa
), you can add this section:
# fasta format nucleotide sequences.
syntax "fasta" "\.fasta$" "\.fas$" "\.fa$"
color brightwhite "^\>.*"
color brightgreen "[Aa]"
color brightred "[Tt]"
color brightblue "[Cc]"
color brightyellow "[Gg]"
Similarly, for amino acid fasta files, you can add these lines (ending with .faa
or .aa
):
#colored based on aminoacid properties
syntax "fasta" "\.faa$" "\.aa$"
color brightwhite "^\>.*"
color brightblue "[AILMFWV]"
color brightred "[RK]"
color brightgreen "[NQ]"
color white "[C]"
color magenta "[ED]"
color red "[G]"
color cyan "[HY]"
color brightyellow "[P]"
color green "[ST]"
This can also be used for alignment files (nucleotide or proteins) by adding more extensions to the above syntax coloring. You can also add other coloring patterns for other file extensions using the same method.
Other examples:
FASTQ file
## For FASTQ file
syntax "fastq" "\.fq$" "\.fastq$"
color brightred "^(@|\+).*$"
color brightgreen "^[ATGCN]+$"
For DNA sequence in NEXUS/PAUP format
"nexus" "\.nexus$" "\.nex$"
color brightgreen "[Aa]"
color brightred "[Tt]"
color brightblue "[Cc]"
color brightyellow "[Gg]"
color brightwhite "(^[ ])\{?[0-9A-Z_!@#$*?-]+\}?"
color brightwhite start="\[" end="\]"
color brightred "(#NEXUS|End|;)"
color brightyellow "(Dimensions|Format|Matrix)"
color brightcyan "(Begin DATA|ntax|nchar|datatype|gap)"
color brightred "\=\{?[0-9A-Z_!@#$*?-]+\}?"
Alignment files (protein)
## For Protein CLUSTALW format
syntax "clustalw" "\.clw$" "\.aln$"
color brightblue "[AILMFWV]"
color brightred "[RK]"
color brightgreen "[NQ]"
color white "[C]"
color magenta "[ED]"
color red "[G]"
color cyan "[HY]"
color brightyellow "[P]"
color green "[ST]"
The screenshot for the above format:
Using BioSyntax (for vim and other tools)
BioSyntax integrates with vim, less, gedit, & sublime and automatically recognizes file formats for various biological file formats. The publication is available here and you should cite the article if you use this tool in your project.
Install
Follow these guidelines from their official website. SInce you will most likely not have sudo
access, you may have to install source-highlight
program manually first before attempting BioSyntax installation. Follow these steps:
wget ftp://ftp.gnu.org/gnu/src-highlite/source-highlight-3.1.8.tar.gz
tar xf source-highlight-3.1.8.tar.gz
mkdir /path/to/somedir/sourcehighlight/
cd source-highlight-3.1.8
./configure --prefix=/path/to/somedir/sourcehighlight
make
make install
Once installed, set the variables correctly so that the program can find the files it needs. Add these lines to your .bashrc
PATH=PATH:/path/to/installationdir/bin
LD_LIBRARY_PATH=LD_LIBRARY_PATH:/path/to/installationdir/lib
LIBRARY_PATH=LIBRARY_PATH:/path/to/installationdir/lib
PKG_CONFIG_PATH=PKG_CONFIG_PATH:/path/to/installationdir/lib/pkgconfig
CMAKE_LIBRARY_PATH=CMAKE_LIBRARY_PATH:/path/to/installationdir/lib
LD_LIBRARY_PATH=LD_LIBRARY_PATH:/path/to/installationdir/lib
C_INCLUDE_PATH=C_INCLUDE_PATH:/path/to/installationdir/include
CPLUS_INCLUDE_PATH=CPLUS_INCLUDE_PATH:/path/to/installationdir/include
CMAKE_INCLUDE_PATH=CMAKE_INCLUDE_PATH:/path/to/installationdir/include
MANPATH=MANPATH:/path/to/installationdir/share/man
Now, you are ready to install bioSyntax
wget https://github.com/bioSyntax/bioSyntax/releases/download/v1.0.0/bioSyntax-1.0.0.zip
unzip bioSyntax-1.0.0.zip
cd bioSyntax-master
# for installing colors for less
bash bioSyntax_INSTALL.sh less
# or for vim
bash bioSyntax_INSTALL.sh vim
restart the terminal and you’re all set to use the bioSyntax!
Further Reading
System info and access permissionsSuperuser command: sudo
Getting started with UNIX + video + exercises
Useful text manipulation programs
GREP – simple search for regular expressions
SED – replace pattern in stream text
AWK – advanced text processing
BIOAWK – biological data manipulation
DIFF, COMM, CMP – comparing and collating two files with Unix
UNIX Commands
MODULE 04: Development Environment