{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"ISRC Python Workshop: Introduction to Unix Bash commands\n",
"\n",
"___Introduction to Unix Bash commands___"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While I use Jupyter notebooks for illustration purposes, it is more common to directly use [terminal](https://en.wikipedia.org/wiki/Terminal_(macOS)). You can find it in `Others -> Terminal` or by spotlight search.\n",
"\n",
"If you are also interested in using Bash in notebook, please checkout [takluyver/bash_kernel](https://github.com/takluyver/bash_kernel)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"@author: Zhiya Zuo\n",
"\n",
"@email: zhiya-zuo@uiowa.edu"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bash commands are no different from many other languages such as Java or Python. We can what we code. For example, we can print out the current working directory."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:46.055155Z",
"start_time": "2018-02-08T17:52:45.947370Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop\n"
]
}
],
"source": [
"pwd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also print out what are in the current working directories."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:46.526607Z",
"start_time": "2018-02-08T17:52:46.405939Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n",
"5-Getting-Data-Using-APIs.ipynb\n",
"\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n",
"another_tmp_pd.csv\n",
"\u001b[34marchived-files\u001b[39;49m\u001b[0m\n",
"\u001b[34mdata\u001b[39;49m\u001b[0m\n",
"renamed_tmp1.csv\n",
"\u001b[34msample-data\u001b[39;49m\u001b[0m\n",
"tmp_pd.csv\n",
"\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n",
"weather_keys.csv\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just as graphical user interfaces (GUIs), we can speak \"bash language\" to interact with our computers. In fact, they are more powerful. For example, you cannot use the [HPC](https://hpc.uiowa.edu/) systems until you know something about shell programming. Note that Bash is only one of the shell programs but probably the most popular one.\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Working directories"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get started by the concept of ___working directory___. As its name suggests, working directory is where you work in, or just which folder/directory you are at right now. As the previous example shows, there is a ___program___ called `pwd` that can help us do such thing."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:48.818752Z",
"start_time": "2018-02-08T17:52:48.706553Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop\n"
]
}
],
"source": [
"pwd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And, as before, we can list what we have in our current working directory by `ls`"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:49.642273Z",
"start_time": "2018-02-08T17:52:49.518847Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n",
"5-Getting-Data-Using-APIs.ipynb\n",
"\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n",
"another_tmp_pd.csv\n",
"\u001b[34marchived-files\u001b[39;49m\u001b[0m\n",
"\u001b[34mdata\u001b[39;49m\u001b[0m\n",
"renamed_tmp1.csv\n",
"\u001b[34msample-data\u001b[39;49m\u001b[0m\n",
"tmp_pd.csv\n",
"\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n",
"weather_keys.csv\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What if I do not want to stay here? Suppose I want to go to ___sample-data___ folder, I can ___change directory___ by `cd`"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:51.148778Z",
"start_time": "2018-02-08T17:52:51.038652Z"
}
},
"outputs": [],
"source": [
"cd sample-data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can verify that we indeed changed our directory by `pwd` and `ls`"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:52.844832Z",
"start_time": "2018-02-08T17:52:52.735145Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop/sample-data\n"
]
}
],
"source": [
"pwd"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:52:53.393487Z",
"start_time": "2018-02-08T17:52:53.279302Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31mkarate.gml\u001b[39;49m\u001b[0m sample_tweets.csv \u001b[34mterrorists\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, there's no need to `cd` every time, if we just want to check or read some files in places outside our current working directory. We can use either the ___absolute___ or ___relative path___. What we get from `pwd` is an absolute path that shows the full path. Let's try an example with this. Let's say we are going to print the contents in a CSV file called ___tmp1.csv___, which is a level up compared to our current path."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:09.722352Z",
"start_time": "2018-02-08T17:53:09.604585Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n",
"5\n",
"6\n",
"7\n",
"8\n",
"9\n"
]
}
],
"source": [
"cat /Users/zhiyzuo/OneDrive\\ -\\ University\\ of\\ Iowa/ISRC\\ Python\\ Workshop/tmp1.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that we use escaping characters for each space for our path (although this is actually NOT a good habit)"
]
},
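{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small sketch of the two common ways to handle spaces, the cell below builds a throwaway folder with `mktemp -d` and reads the same file twice: once with a backslash before the space and once with the whole path in quotes. The folder name `my folder` and the file `note.txt` are made up for illustration and are not part of the workshop files."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp=\"$(mktemp -d)\"                      # scratch directory, location chosen by the system\n",
"mkdir \"$tmp/my folder\"                  # hypothetical folder name containing a space\n",
"echo \"hello\" > \"$tmp/my folder/note.txt\"\n",
"cat \"$tmp\"/my\\ folder/note.txt          # escape the space with a backslash\n",
"cat \"$tmp/my folder/note.txt\"           # or quote the whole path"
]
},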
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Instead of typing everything, we can use ___relative path___. The name is pretty self-explanatory: we can refer to a place, relative to our current working directory. Two important notations here:\n",
"- `.` (a dot) means current directory\n",
"- `..` (two dots without spaces) means upper level directory.\n",
"\n",
"For example, we can use paths after `ls` command to print files in that given path."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:12.271803Z",
"start_time": "2018-02-08T17:53:12.156676Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31mkarate.gml\u001b[39;49m\u001b[0m sample_tweets.csv \u001b[34mterrorists\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls ."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:12.758150Z",
"start_time": "2018-02-08T17:53:12.635878Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n",
"5-Getting-Data-Using-APIs.ipynb\n",
"\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n",
"another_tmp_pd.csv\n",
"\u001b[34marchived-files\u001b[39;49m\u001b[0m\n",
"\u001b[34mdata\u001b[39;49m\u001b[0m\n",
"\u001b[34msample-data\u001b[39;49m\u001b[0m\n",
"tmp1.csv\n",
"tmp_pd.csv\n",
"\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n",
"weather_keys.csv\n"
]
}
],
"source": [
"ls .."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Therefore, by `..` (two dots ), we can go back one level without switching working directory:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:14.274526Z",
"start_time": "2018-02-08T17:53:14.149447Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n",
"5\n",
"6\n",
"7\n",
"8\n",
"9\n"
]
}
],
"source": [
"cat ../tmp1.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, it is noteworthy that `~` (tilde) means ___home directory___ in unix systems."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:16.395663Z",
"start_time": "2018-02-08T17:53:16.273571Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34mApplications\u001b[39;49m\u001b[0m alldump\n",
"\u001b[34mDesktop\u001b[39;49m\u001b[0m \u001b[34mecho\u001b[39;49m\u001b[0m\n",
"\u001b[34mDocuments\u001b[39;49m\u001b[0m mycert.pem\n",
"\u001b[34mDownloads\u001b[39;49m\u001b[0m mykey.key\n",
"\u001b[34mDropbox (Personal)\u001b[39;49m\u001b[0m \u001b[34mnltk_data\u001b[39;49m\u001b[0m\n",
"\u001b[34mDropbox (Zhiya-UIowa)\u001b[39;49m\u001b[0m pg_upgrade_internal.log\n",
"\u001b[34mGoogle Drive\u001b[39;49m\u001b[0m pg_upgrade_server.log\n",
"\u001b[34mLibrary\u001b[39;49m\u001b[0m pg_upgrade_utility.log\n",
"\u001b[34mMovies\u001b[39;49m\u001b[0m pub.bib\n",
"\u001b[34mMusic\u001b[39;49m\u001b[0m \u001b[34mscikit_learn_data\u001b[39;49m\u001b[0m\n",
"\u001b[34mOneDrive - University of Iowa\u001b[39;49m\u001b[0m \u001b[34mseaborn-data\u001b[39;49m\u001b[0m\n",
"\u001b[34mPictures\u001b[39;49m\u001b[0m \u001b[34mycm_build\u001b[39;49m\u001b[0m\n",
"\u001b[34mPublic\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls ~"
]
},
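{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see `.`, `..` and `~` side by side without touching the workshop files, here is a minimal sketch in a scratch directory. The names `child` and `parent.txt` are hypothetical. The `cd` happens inside parentheses (a subshell), so the working directory of this notebook should stay where it was."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp=\"$(mktemp -d)\"                 # scratch directory\n",
"mkdir \"$tmp/child\"                 # hypothetical subfolder\n",
"echo \"from parent\" > \"$tmp/parent.txt\"\n",
"(\n",
"  cd \"$tmp/child\"                  # only affects the subshell inside ( )\n",
"  cat ../parent.txt                # relative path: the file lives one level up\n",
"  ls ..                            # lists child and parent.txt\n",
"  ls .                             # the current directory, which is empty here\n",
")\n",
"echo ~                             # the shell expands ~ to the home directory"
]
},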
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Options/Input arguments "
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:15:19.222480Z",
"start_time": "2018-02-08T17:15:19.085949Z"
}
},
"source": [
"Shell commands can take input arguments or options. A convention is to use `-` (dash) to specify arguments. For example, we can ask `ls` to show detailed information of each file/folder:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:18.646962Z",
"start_time": "2018-02-08T17:53:18.521871Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 24\n",
"-rwxr-xr-x 1 zhiyzuo staff 5077 Jan 30 13:57 \u001b[31mkarate.gml\u001b[39;49m\u001b[0m\n",
"-rw-r--r-- 1 zhiyzuo staff 1950 Feb 7 14:57 sample_tweets.csv\n",
"drwxr-xr-x@ 7 zhiyzuo staff 224 Jan 30 13:57 \u001b[34mterrorists\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls -l"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can aggregate different options by directly appending options one after another. The following example shows how to show size in human readable formats (`-h` option) along with a detailed view (`-l`)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:19.403679Z",
"start_time": "2018-02-08T17:53:19.274022Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 24\n",
"-rwxr-xr-x 1 zhiyzuo staff 5.0K Jan 30 13:57 \u001b[31mkarate.gml\u001b[39;49m\u001b[0m\n",
"-rw-r--r-- 1 zhiyzuo staff 1.9K Feb 7 14:57 sample_tweets.csv\n",
"drwxr-xr-x@ 7 zhiyzuo staff 224B Jan 30 13:57 \u001b[34mterrorists\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls -lh"
]
},
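{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to convince yourself that combined options behave like separate ones is to run both forms on the same folder. The sketch below does this in a scratch directory with a made-up file `a.txt`, and also adds `-a`, which shows hidden entries such as `.` and `..`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp=\"$(mktemp -d)\"                 # scratch directory\n",
"echo \"some text\" > \"$tmp/a.txt\"    # hypothetical file to list\n",
"ls -l -h \"$tmp\"                    # options given separately\n",
"ls -lh \"$tmp\"                      # the same options combined behind one dash\n",
"ls -lha \"$tmp\"                     # -a additionally shows hidden entries"
]
},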
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sometimes commands take in arguments for various purposes. Again, using `ls` as example, it can take ___path___ as an argument. Without the path, it will by default show the current listings, as shown above. Given a path, it will list items in that path:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:22.786465Z",
"start_time": "2018-02-08T17:53:22.653003Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n",
"5-Getting-Data-Using-APIs.ipynb\n",
"\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n",
"another_tmp_pd.csv\n",
"\u001b[34marchived-files\u001b[39;49m\u001b[0m\n",
"\u001b[34mdata\u001b[39;49m\u001b[0m\n",
"\u001b[34msample-data\u001b[39;49m\u001b[0m\n",
"tmp1.csv\n",
"tmp_pd.csv\n",
"\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n",
"weather_keys.csv\n"
]
}
],
"source": [
"ls ../"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:24.088264Z",
"start_time": "2018-02-08T17:53:23.975458Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m1_basics.ipynb\u001b[39;49m\u001b[0m \u001b[31m3_topic_modeling.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2_web_scraping.ipynb\u001b[39;49m\u001b[0m \u001b[34mnotebooks_printouts\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls ../archived-files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that all these options can hardly be memorized. Often we will refer to the manual (or documentation). To do this, we can use `man command_name`. For example:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:39.135458Z",
"start_time": "2018-02-08T17:53:38.908637Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"LS(1) BSD General Commands Manual LS(1)\n",
"\n",
"NAME\n",
" ls -- list directory contents\n",
"\n",
"SYNOPSIS\n",
" ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file ...]\n",
"\n",
"DESCRIPTION\n",
" For each operand that names a file of a type other than directory, ls\n",
" displays its name as well as any requested, associated information. For\n",
" each operand that names a file of type directory, ls displays the names\n",
" of files contained within that directory, as well as any requested, asso-\n",
" ciated information.\n",
"\n",
" If no operands are given, the contents of the current directory are dis-\n",
" played. If more than one operand is given, non-directory operands are\n",
" displayed first; directory and non-directory operands are sorted sepa-\n",
" rately and in lexicographical order.\n"
]
}
],
"source": [
"man ls | head -20"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, `man` is a ___command___ that takes one input argument (which should be a Bash command) and outputs the corresponding manual. Therefore, we can definitely pull up the manual for `man` 🤓"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:53:48.571202Z",
"start_time": "2018-02-08T17:53:48.382295Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"man(1) man(1)\n",
"\n",
"\n",
"\n",
"NAME\n",
" man - format and display the on-line manual pages\n",
"\n",
"SYNOPSIS\n",
" man [-acdfFhkKtwW] [--path] [-m system] [-p string] [-C config_file]\n",
" [-M pathlist] [-P pager] [-B browser] [-H htmlpager] [-S section_list]\n",
" [section] name ...\n",
"\n",
"\n",
"DESCRIPTION\n",
" man formats and displays the on-line manual pages. If you specify sec-\n",
" tion, man only looks in that section of the manual. name is normally\n",
" the name of the manual page, which is typically the name of a command,\n",
" function, or file. However, if name contains a slash (/) then man\n",
" interprets it as a file specification, so that you can do man ./foo.5\n",
" or even man /cd/foo/bar.1.gz.\n"
]
}
],
"source": [
"man man | head -20"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that I use `| head -20` to limit the number of output to 20 lines/rows. `|` is ___pipe character___ and `head` is a command to show the ___head___ of some output, where `- 20` limit to the first 20 lines/rows. Detailed coverage is beyond the scope of this workshop though."
]
},
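{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pipes are not specific to `man`. As a rough illustration, the cell below uses `seq` to generate the numbers 1 to 100 as stand-in output and pipes it into other commands. `head -n 5` is the more explicit spelling of the `head -5` shorthand used above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"seq 1 100 | head -n 5     # seq prints 1..100; head keeps only the first 5 lines\n",
"seq 1 100 | head -5       # shorthand for the same thing\n",
"seq 1 100 | wc -l         # a pipe works with other commands too: count the lines"
]
},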
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Some practical commands"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Given that we now understand some basics of Bash, it is a good time to know more commonly used commands. Before we do anything, I will swich my working directory back one level."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:35.913682Z",
"start_time": "2018-02-08T17:55:35.801053Z"
}
},
"outputs": [],
"source": [
"cd .."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:36.423141Z",
"start_time": "2018-02-08T17:55:36.316047Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop\n"
]
}
],
"source": [
"pwd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Move `mv`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Move command `mv` is an interesting one. You can use it to do two things\n",
"1. Move files/folders\n",
"2. Change file/folder names. Essentiall, `mv` rename an item by \"moving it to another item\""
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:33:47.821372Z",
"start_time": "2018-02-08T17:33:47.705802Z"
}
},
"source": [
"Let's try move the file ___tmp1.csv___ to one level up and move it back. Note that for `mv`, we need two arguments: \n",
"1. what to be moved?\n",
"2. where to?"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:39.107814Z",
"start_time": "2018-02-08T17:55:38.987546Z"
}
},
"outputs": [],
"source": [
"mv tmp1.csv ../"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Check if ___tmp1.csv___ is indeed in ___../___"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:39.777028Z",
"start_time": "2018-02-08T17:55:39.655904Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34m2017summer\u001b[39;49m\u001b[0m \u001b[34mRanking-Hiring\u001b[39;49m\u001b[0m\n",
"\u001b[34m2018JCDL-poster\u001b[39;49m\u001b[0m \u001b[34mSupply-Chain-ABM\u001b[39;49m\u001b[0m\n",
"\u001b[34m2018spring\u001b[39;49m\u001b[0m \u001b[34mTime-Series-Precition\u001b[39;49m\u001b[0m\n",
"\u001b[34mCollaboration-in-Multi\u001b[39;49m\u001b[0m \u001b[34mTopics-over-Time-Replication\u001b[39;49m\u001b[0m\n",
"\u001b[34mDATA-ARCHIVES\u001b[39;49m\u001b[0m \u001b[34mWeibo\u001b[39;49m\u001b[0m\n",
"\u001b[34mData\u001b[39;49m\u001b[0m \u001b[34mdmig\u001b[39;49m\u001b[0m\n",
"\u001b[34mISRC Python Workshop\u001b[39;49m\u001b[0m \u001b[34miSchool\u001b[39;49m\u001b[0m\n",
"Icon? \u001b[34mjava-tm\u001b[39;49m\u001b[0m\n",
"\u001b[34mIntern-App-2018Summer\u001b[39;49m\u001b[0m \u001b[34mjava-topic-model\u001b[39;49m\u001b[0m\n",
"\u001b[34mJTM\u001b[39;49m\u001b[0m \u001b[34mpaper-review-invitation\u001b[39;49m\u001b[0m\n",
"\u001b[34mLearn-Notebooks\u001b[39;49m\u001b[0m \u001b[34mread-java-tm\u001b[39;49m\u001b[0m\n",
"\u001b[34mNRC\u001b[39;49m\u001b[0m tmp1.csv\n",
"\u001b[34mPolicy-School\u001b[39;49m\u001b[0m\n"
]
}
],
"source": [
"ls ../"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Move it back. Recall that `.` (dot) means current working directory"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:40.575067Z",
"start_time": "2018-02-08T17:55:40.462610Z"
}
},
"outputs": [],
"source": [
"mv ../tmp1.csv ."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:40.943414Z",
"start_time": "2018-02-08T17:55:40.813761Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n",
"5-Getting-Data-Using-APIs.ipynb\n",
"\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n",
"another_tmp_pd.csv\n",
"\u001b[34marchived-files\u001b[39;49m\u001b[0m\n",
"\u001b[34mdata\u001b[39;49m\u001b[0m\n",
"\u001b[34msample-data\u001b[39;49m\u001b[0m\n",
"tmp1.csv\n",
"tmp_pd.csv\n",
"\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n",
"weather_keys.csv\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can rename it by doing"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:42.044347Z",
"start_time": "2018-02-08T17:55:41.929627Z"
}
},
"outputs": [],
"source": [
"mv tmp1.csv renamed_tmp1.csv"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:42.518491Z",
"start_time": "2018-02-08T17:55:42.381033Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n",
"\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n",
"5-Getting-Data-Using-APIs.ipynb\n",
"\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n",
"another_tmp_pd.csv\n",
"\u001b[34marchived-files\u001b[39;49m\u001b[0m\n",
"\u001b[34mdata\u001b[39;49m\u001b[0m\n",
"renamed_tmp1.csv\n",
"\u001b[34msample-data\u001b[39;49m\u001b[0m\n",
"tmp_pd.csv\n",
"\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n",
"weather_keys.csv\n"
]
}
],
"source": [
"ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And we can see there's a ___renamed_tmp1.csv___ but not ___tmp1.csv___ now."
]
},
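{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both uses of `mv` can be replayed safely in a scratch directory. In this sketch the names `old.csv`, `new.csv` and `sub` are made up. When the destination is an existing folder, `mv` moves the file into it; when the destination is a new name, `mv` renames the file. Keep in mind that `mv` normally overwrites an existing destination file without asking; the `-i` option makes it ask first."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp=\"$(mktemp -d)\"                          # scratch directory\n",
"mkdir \"$tmp/sub\"                            # hypothetical destination folder\n",
"echo \"data\" > \"$tmp/old.csv\"                # hypothetical file to move\n",
"mv \"$tmp/old.csv\" \"$tmp/sub/\"               # destination is a folder: move\n",
"mv \"$tmp/sub/old.csv\" \"$tmp/sub/new.csv\"    # destination is a new name: rename\n",
"ls \"$tmp\" \"$tmp/sub\"                        # only sub/new.csv should remain"
]
},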
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Copy `cp`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copy command `cp` is very similar to `mv`, except that it is not moving but copying a specific file/folder. For example, we can duplicate ___tmp_pd.csv___ by doing:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:45.297496Z",
"start_time": "2018-02-08T17:55:45.175897Z"
}
},
"outputs": [],
"source": [
"cp tmp_pd.csv another_tmp_pd.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can verify by printing both files"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:46.033057Z",
"start_time": "2018-02-08T17:55:45.919177Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0,1,2\n",
"1.0,2.0,3.0\n",
"4.0,5.0,6.0\n"
]
}
],
"source": [
"cat tmp_pd.csv"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:46.790272Z",
"start_time": "2018-02-08T17:55:46.675250Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0,1,2\n",
"1.0,2.0,3.0\n",
"4.0,5.0,6.0\n"
]
}
],
"source": [
"cat another_tmp_pd.csv"
]
},
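{
"cell_type": "markdown",
"metadata": {},
"source": [
"One difference from `mv` is worth a small sketch: copying a folder needs the recursive option `-R`, otherwise `cp` refuses to copy it. The cell below uses a scratch directory with made-up names (`src`, `table.csv`, `src_backup`) to copy first a single file and then a whole folder."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp=\"$(mktemp -d)\"                               # scratch directory\n",
"mkdir \"$tmp/src\"                                 # hypothetical folder to copy\n",
"echo \"1,2,3\" > \"$tmp/src/table.csv\"\n",
"cp \"$tmp/src/table.csv\" \"$tmp/table_copy.csv\"    # copy a single file\n",
"cp -R \"$tmp/src\" \"$tmp/src_backup\"               # -R copies a folder and its contents\n",
"ls \"$tmp\" \"$tmp/src_backup\"                      # the original src is still there"
]
},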
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Reading files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are many commands for doing this. We just used `cat`. `cat` will directly print everything into the terminal/standard output. It can also be used for concatenating files."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"ExecuteTime": {
"end_time": "2018-02-08T17:55:48.681658Z",
"start_time": "2018-02-08T17:55:48.566314Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n",
"1\n",
"2\n",
"3\n",
"4\n",
"5\n",
"6\n",
"7\n",
"8\n",
"9\n"
]
}
],
"source": [
"cat renamed_tmp1.csv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When we are reading large files, we probably don't want this. We can use `less`. Let's try it in terminal because `less` will not work propoerly in Jupyter notebook."
]
},
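{
"cell_type": "markdown",
"metadata": {},
"source": [
"Inside a notebook, a non-interactive way to peek at a large file is to look only at its size and its two ends. This is just a sketch: `big.txt` is a made-up stand-in file filled with the numbers 1 to 1000 by `seq`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tmp=\"$(mktemp -d)\"              # scratch directory\n",
"seq 1 1000 > \"$tmp/big.txt\"     # stand-in for a large file\n",
"wc -l \"$tmp/big.txt\"            # how many lines does it have?\n",
"head -n 3 \"$tmp/big.txt\"        # first three lines\n",
"tail -n 3 \"$tmp/big.txt\"        # last three lines"
]
},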
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Conclusions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Due to time contraints, we can only cover these simple examples. There are really a lot more to read: `grep`, `sed`, `ssh`, `scp`, `ps`, `rm`, etc... Bash command is really powerful and used extensitvely for various purposes. Below I list two tutorials for Bash that I find really helpful."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Further readings:\n",
"\n",
"- http://www.bash.academy/\n",
"- https://ryanstutorials.net/bash-scripting-tutorial/"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Bash",
"language": "bash",
"name": "bash"
},
"language_info": {
"codemirror_mode": "shell",
"file_extension": ".sh",
"mimetype": "text/x-sh",
"name": "bash"
},
"toc": {
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"toc_cell": false,
"toc_position": {},
"toc_section_display": "block",
"toc_window_display": true
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}