{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "ISRC Python Workshop: Introduction to Unix Bash commands\n", "\n", "___Introduction to Unix Bash commands___" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While I use Jupyter notebooks for illustration purposes, it is more common to use a [terminal](https://en.wikipedia.org/wiki/Terminal_(macOS)) directly. You can find it in `Others -> Terminal` or by Spotlight search.\n", "\n", "If you are also interested in using Bash in a notebook, please check out [takluyver/bash_kernel](https://github.com/takluyver/bash_kernel)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "@author: Zhiya Zuo\n", "\n", "@email: zhiya-zuo@uiowa.edu" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bash is no different from many other languages such as Java or Python. We can run what we code. For example, we can print out the current working directory." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:46.055155Z", "start_time": "2018-02-08T17:52:45.947370Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop\n" ] } ], "source": [ "pwd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also print out what is in the current working directory." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:46.526607Z", "start_time": "2018-02-08T17:52:46.405939Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n", "5-Getting-Data-Using-APIs.ipynb\n", "\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n", "another_tmp_pd.csv\n", "\u001b[34marchived-files\u001b[39;49m\u001b[0m\n", "\u001b[34mdata\u001b[39;49m\u001b[0m\n", "renamed_tmp1.csv\n", "\u001b[34msample-data\u001b[39;49m\u001b[0m\n", "tmp_pd.csv\n", "\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n", "weather_keys.csv\n" ] } ], "source": [ "ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just as with graphical user interfaces (GUIs), we can speak \"bash language\" to interact with our computers. 
In fact, the command line is often more powerful than a GUI. For example, you cannot use the [HPC](https://hpc.uiowa.edu/) systems until you know something about shell programming. Note that Bash is only one of many shell programs, but probably the most popular one.\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Working directories" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get started with the concept of the ___working directory___. As its name suggests, the working directory is where you are working, that is, the folder/directory you are in right now. As the previous example shows, there is a ___program___ called `pwd` that prints it." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:48.818752Z", "start_time": "2018-02-08T17:52:48.706553Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop\n" ] } ], "source": [ "pwd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, as before, we can list what we have in our current working directory with `ls`:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:49.642273Z", "start_time": "2018-02-08T17:52:49.518847Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n", "5-Getting-Data-Using-APIs.ipynb\n", "\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n", "another_tmp_pd.csv\n", "\u001b[34marchived-files\u001b[39;49m\u001b[0m\n", 
"\u001b[34mdata\u001b[39;49m\u001b[0m\n", "renamed_tmp1.csv\n", "\u001b[34msample-data\u001b[39;49m\u001b[0m\n", "tmp_pd.csv\n", "\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n", "weather_keys.csv\n" ] } ], "source": [ "ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What if I do not want to stay here? Suppose I want to go to the ___sample-data___ folder. I can ___change directory___ with `cd`:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:51.148778Z", "start_time": "2018-02-08T17:52:51.038652Z" } }, "outputs": [], "source": [ "cd sample-data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can verify that we indeed changed our directory with `pwd` and `ls`:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:52.844832Z", "start_time": "2018-02-08T17:52:52.735145Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop/sample-data\n" ] } ], "source": [ "pwd" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:52:53.393487Z", "start_time": "2018-02-08T17:52:53.279302Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31mkarate.gml\u001b[39;49m\u001b[0m sample_tweets.csv \u001b[34mterrorists\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, there's no need to `cd` every time if we just want to check or read files outside our current working directory. We can use either an ___absolute___ or a ___relative path___. What we get from `pwd` is an absolute path: the full path starting from the root directory `/`. Let's try an example. Say we want to print the contents of a CSV file called ___tmp1.csv___, which is one level up from our current directory."
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:09.722352Z", "start_time": "2018-02-08T17:53:09.604585Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ] } ], "source": [ "cat /Users/zhiyzuo/OneDrive\\ -\\ University\\ of\\ Iowa/ISRC\\ Python\\ Workshop/tmp1.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that we use an escape character (a backslash) before each space in the path (although this is actually NOT a good habit)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of typing everything, we can use a ___relative path___. The name is pretty self-explanatory: we refer to a place relative to our current working directory. Two important notations here:\n", "- `.` (a dot) means the current directory\n", "- `..` (two dots without spaces) means the parent directory, one level up.\n", "\n", "For example, we can pass a path to the `ls` command to list the files in that path." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:12.271803Z", "start_time": "2018-02-08T17:53:12.156676Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31mkarate.gml\u001b[39;49m\u001b[0m sample_tweets.csv \u001b[34mterrorists\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls ." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:12.758150Z", "start_time": "2018-02-08T17:53:12.635878Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n", "5-Getting-Data-Using-APIs.ipynb\n", "\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n", "another_tmp_pd.csv\n", "\u001b[34marchived-files\u001b[39;49m\u001b[0m\n", "\u001b[34mdata\u001b[39;49m\u001b[0m\n", "\u001b[34msample-data\u001b[39;49m\u001b[0m\n", "tmp1.csv\n", "tmp_pd.csv\n", "\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n", "weather_keys.csv\n" ] } ], "source": [ "ls .." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Therefore, with `..` (two dots), we can reach files one level up without switching the working directory:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:14.274526Z", "start_time": "2018-02-08T17:53:14.149447Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ] } ], "source": [ "cat ../tmp1.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, note that `~` (tilde) means the ___home directory___ on Unix systems."
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:16.395663Z", "start_time": "2018-02-08T17:53:16.273571Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34mApplications\u001b[39;49m\u001b[0m alldump\n", "\u001b[34mDesktop\u001b[39;49m\u001b[0m \u001b[34mecho\u001b[39;49m\u001b[0m\n", "\u001b[34mDocuments\u001b[39;49m\u001b[0m mycert.pem\n", "\u001b[34mDownloads\u001b[39;49m\u001b[0m mykey.key\n", "\u001b[34mDropbox (Personal)\u001b[39;49m\u001b[0m \u001b[34mnltk_data\u001b[39;49m\u001b[0m\n", "\u001b[34mDropbox (Zhiya-UIowa)\u001b[39;49m\u001b[0m pg_upgrade_internal.log\n", "\u001b[34mGoogle Drive\u001b[39;49m\u001b[0m pg_upgrade_server.log\n", "\u001b[34mLibrary\u001b[39;49m\u001b[0m pg_upgrade_utility.log\n", "\u001b[34mMovies\u001b[39;49m\u001b[0m pub.bib\n", "\u001b[34mMusic\u001b[39;49m\u001b[0m \u001b[34mscikit_learn_data\u001b[39;49m\u001b[0m\n", "\u001b[34mOneDrive - University of Iowa\u001b[39;49m\u001b[0m \u001b[34mseaborn-data\u001b[39;49m\u001b[0m\n", "\u001b[34mPictures\u001b[39;49m\u001b[0m \u001b[34mycm_build\u001b[39;49m\u001b[0m\n", "\u001b[34mPublic\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls ~" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Options/Input arguments " ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:15:19.222480Z", "start_time": "2018-02-08T17:15:19.085949Z" } }, "source": [ "Shell commands can take input arguments or options. A convention is to use `-` (dash) to specify arguments. 
For example, we can ask `ls` to show detailed information about each file/folder:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:18.646962Z", "start_time": "2018-02-08T17:53:18.521871Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 24\n", "-rwxr-xr-x 1 zhiyzuo staff 5077 Jan 30 13:57 \u001b[31mkarate.gml\u001b[39;49m\u001b[0m\n", "-rw-r--r-- 1 zhiyzuo staff 1950 Feb 7 14:57 sample_tweets.csv\n", "drwxr-xr-x@ 7 zhiyzuo staff 224 Jan 30 13:57 \u001b[34mterrorists\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls -l" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can combine options by appending them one after another. The following example shows sizes in a human-readable format (`-h` option) along with the detailed view (`-l`):" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:19.403679Z", "start_time": "2018-02-08T17:53:19.274022Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 24\n", "-rwxr-xr-x 1 zhiyzuo staff 5.0K Jan 30 13:57 \u001b[31mkarate.gml\u001b[39;49m\u001b[0m\n", "-rw-r--r-- 1 zhiyzuo staff 1.9K Feb 7 14:57 sample_tweets.csv\n", "drwxr-xr-x@ 7 zhiyzuo staff 224B Jan 30 13:57 \u001b[34mterrorists\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls -lh" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Commands can also take arguments. Again using `ls` as an example, it can take a ___path___ as an argument. Without a path, it lists the current working directory by default, as shown above. 
Given a path, it will list items in that path:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:22.786465Z", "start_time": "2018-02-08T17:53:22.653003Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n", "5-Getting-Data-Using-APIs.ipynb\n", "\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n", "another_tmp_pd.csv\n", "\u001b[34marchived-files\u001b[39;49m\u001b[0m\n", "\u001b[34mdata\u001b[39;49m\u001b[0m\n", "\u001b[34msample-data\u001b[39;49m\u001b[0m\n", "tmp1.csv\n", "tmp_pd.csv\n", "\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n", "weather_keys.csv\n" ] } ], "source": [ "ls ../" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:24.088264Z", "start_time": "2018-02-08T17:53:23.975458Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m1_basics.ipynb\u001b[39;49m\u001b[0m \u001b[31m3_topic_modeling.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2_web_scraping.ipynb\u001b[39;49m\u001b[0m \u001b[34mnotebooks_printouts\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls ../archived-files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that all these options can hardly be memorized. Often we will refer to the manual (or documentation). To do this, we can use `man command_name`. 
For example:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:39.135458Z", "start_time": "2018-02-08T17:53:38.908637Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "LS(1) BSD General Commands Manual LS(1)\n", "\n", "NAME\n", " ls -- list directory contents\n", "\n", "SYNOPSIS\n", " ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file ...]\n", "\n", "DESCRIPTION\n", " For each operand that names a file of a type other than directory, ls\n", " displays its name as well as any requested, associated information. For\n", " each operand that names a file of type directory, ls displays the names\n", " of files contained within that directory, as well as any requested, asso-\n", " ciated information.\n", "\n", " If no operands are given, the contents of the current directory are dis-\n", " played. If more than one operand is given, non-directory operands are\n", " displayed first; directory and non-directory operands are sorted sepa-\n", " rately and in lexicographical order.\n" ] } ], "source": [ "man ls | head -20" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, `man` is a ___command___ that takes one input argument (which should be a Bash command) and outputs the corresponding manual. Therefore, we can definitely pull up the manual for `man` 🤓" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:53:48.571202Z", "start_time": "2018-02-08T17:53:48.382295Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "man(1) man(1)\n", "\n", "\n", "\n", "NAME\n", " man - format and display the on-line manual pages\n", "\n", "SYNOPSIS\n", " man [-acdfFhkKtwW] [--path] [-m system] [-p string] [-C config_file]\n", " [-M pathlist] [-P pager] [-B browser] [-H htmlpager] [-S section_list]\n", " [section] name ...\n", "\n", "\n", "DESCRIPTION\n", " man formats and displays the on-line manual pages. 
If you specify sec-\n", " tion, man only looks in that section of the manual. name is normally\n", " the name of the manual page, which is typically the name of a command,\n", " function, or file. However, if name contains a slash (/) then man\n", " interprets it as a file specification, so that you can do man ./foo.5\n", " or even man /cd/foo/bar.1.gz.\n" ] } ], "source": [ "man man | head -20" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that I use `| head -20` to limit the output to 20 lines. `|` is the ___pipe character___, which sends the output of one command to the next command as input, and `head` is a command that shows the ___head___ (beginning) of its input; `-20` limits it to the first 20 lines. Detailed coverage is beyond the scope of this workshop, though." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Some practical commands" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we understand some basics of Bash, it is a good time to learn some more commonly used commands. Before we do anything, I will switch my working directory back one level." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:35.913682Z", "start_time": "2018-02-08T17:55:35.801053Z" } }, "outputs": [], "source": [ "cd .." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:36.423141Z", "start_time": "2018-02-08T17:55:36.316047Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/Users/zhiyzuo/OneDrive - University of Iowa/ISRC Python Workshop\n" ] } ], "source": [ "pwd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Move `mv`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The move command `mv` is an interesting one. You can use it to do two things:\n", "1. 
Move files/folders\n", "2. Change file/folder names. 
Essentially, `mv` renames an item by \"moving\" it to a new name." ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:33:47.821372Z", "start_time": "2018-02-08T17:33:47.705802Z" } }, "source": [ "Let's try moving the file ___tmp1.csv___ one level up and then moving it back. Note that `mv` needs two arguments:\n", "1. What is to be moved?\n", "2. Where to?" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:39.107814Z", "start_time": "2018-02-08T17:55:38.987546Z" } }, "outputs": [], "source": [ "mv tmp1.csv ../" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that ___tmp1.csv___ is indeed in ___../___" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:39.777028Z", "start_time": "2018-02-08T17:55:39.655904Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[34m2017summer\u001b[39;49m\u001b[0m \u001b[34mRanking-Hiring\u001b[39;49m\u001b[0m\n", "\u001b[34m2018JCDL-poster\u001b[39;49m\u001b[0m \u001b[34mSupply-Chain-ABM\u001b[39;49m\u001b[0m\n", "\u001b[34m2018spring\u001b[39;49m\u001b[0m \u001b[34mTime-Series-Precition\u001b[39;49m\u001b[0m\n", "\u001b[34mCollaboration-in-Multi\u001b[39;49m\u001b[0m \u001b[34mTopics-over-Time-Replication\u001b[39;49m\u001b[0m\n", "\u001b[34mDATA-ARCHIVES\u001b[39;49m\u001b[0m \u001b[34mWeibo\u001b[39;49m\u001b[0m\n", "\u001b[34mData\u001b[39;49m\u001b[0m \u001b[34mdmig\u001b[39;49m\u001b[0m\n", "\u001b[34mISRC Python Workshop\u001b[39;49m\u001b[0m \u001b[34miSchool\u001b[39;49m\u001b[0m\n", "Icon? 
\u001b[34mjava-tm\u001b[39;49m\u001b[0m\n", "\u001b[34mIntern-App-2018Summer\u001b[39;49m\u001b[0m \u001b[34mjava-topic-model\u001b[39;49m\u001b[0m\n", "\u001b[34mJTM\u001b[39;49m\u001b[0m \u001b[34mpaper-review-invitation\u001b[39;49m\u001b[0m\n", "\u001b[34mLearn-Notebooks\u001b[39;49m\u001b[0m \u001b[34mread-java-tm\u001b[39;49m\u001b[0m\n", "\u001b[34mNRC\u001b[39;49m\u001b[0m tmp1.csv\n", "\u001b[34mPolicy-School\u001b[39;49m\u001b[0m\n" ] } ], "source": [ "ls ../" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Move it back. Recall that `.` (dot) means current working directory" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:40.575067Z", "start_time": "2018-02-08T17:55:40.462610Z" } }, "outputs": [], "source": [ "mv ../tmp1.csv ." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:40.943414Z", "start_time": "2018-02-08T17:55:40.813761Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n", "5-Getting-Data-Using-APIs.ipynb\n", "\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n", "another_tmp_pd.csv\n", "\u001b[34marchived-files\u001b[39;49m\u001b[0m\n", "\u001b[34mdata\u001b[39;49m\u001b[0m\n", "\u001b[34msample-data\u001b[39;49m\u001b[0m\n", "tmp1.csv\n", "tmp_pd.csv\n", "\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n", "weather_keys.csv\n" ] } ], "source": [ "ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can rename it by doing" ] }, { "cell_type": "code", "execution_count": 30, 
"metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:42.044347Z", "start_time": "2018-02-08T17:55:41.929627Z" } }, "outputs": [], "source": [ "mv tmp1.csv renamed_tmp1.csv" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:42.518491Z", "start_time": "2018-02-08T17:55:42.381033Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m0-Installation-Environment-Setup.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m1-Variables-Data_Structures-Control_Logic.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m2-Functions-External_Libraries-File_IO.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m3-Network-Analysis-with-NetworkX.ipynb\u001b[39;49m\u001b[0m\n", "\u001b[31m4-Visualization-with-Matplotlib.ipynb\u001b[39;49m\u001b[0m\n", "5-Getting-Data-Using-APIs.ipynb\n", "\u001b[31mBash-Tutorial.ipynb\u001b[39;49m\u001b[0m\n", "another_tmp_pd.csv\n", "\u001b[34marchived-files\u001b[39;49m\u001b[0m\n", "\u001b[34mdata\u001b[39;49m\u001b[0m\n", "renamed_tmp1.csv\n", "\u001b[34msample-data\u001b[39;49m\u001b[0m\n", "tmp_pd.csv\n", "\u001b[31mtwitter_keys.csv\u001b[39;49m\u001b[0m\n", "weather_keys.csv\n" ] } ], "source": [ "ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can see there's a ___renamed_tmp1.csv___ but not ___tmp1.csv___ now." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Copy `cp`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Copy command `cp` is very similar to `mv`, except that it is not moving but copying a specific file/folder. 
For example, we can duplicate ___tmp_pd.csv___ by doing:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:45.297496Z", "start_time": "2018-02-08T17:55:45.175897Z" } }, "outputs": [], "source": [ "cp tmp_pd.csv another_tmp_pd.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can verify by printing both files:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:46.033057Z", "start_time": "2018-02-08T17:55:45.919177Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0,1,2\n", "1.0,2.0,3.0\n", "4.0,5.0,6.0\n" ] } ], "source": [ "cat tmp_pd.csv" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:46.790272Z", "start_time": "2018-02-08T17:55:46.675250Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0,1,2\n", "1.0,2.0,3.0\n", "4.0,5.0,6.0\n" ] } ], "source": [ "cat another_tmp_pd.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Reading files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are many commands for reading files. We just used `cat`, which prints the entire file to the terminal (standard output). It can also be used for concatenating files." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "ExecuteTime": { "end_time": "2018-02-08T17:55:48.681658Z", "start_time": "2018-02-08T17:55:48.566314Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "6\n", "7\n", "8\n", "9\n" ] } ], "source": [ "cat renamed_tmp1.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we are reading large files, we probably don't want to print everything at once. Instead, we can use `less`, which shows one screen at a time. 
Let's try it in a terminal, because `less` will not work properly in a Jupyter notebook."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Conclusions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Due to time constraints, we can only cover these simple examples. There is a lot more to learn: `grep`, `sed`, `ssh`, `scp`, `ps`, `rm`, etc. Bash is really powerful and is used extensively for various purposes. Below I list two tutorials for Bash that I find really helpful." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Further readings:\n", "\n", "- http://www.bash.academy/\n", "- https://ryanstutorials.net/bash-scripting-tutorial/" ] } ], "metadata": { "kernelspec": { "display_name": "Bash", "language": "bash", "name": "bash" }, "language_info": { "codemirror_mode": "shell", "file_extension": ".sh", "mimetype": "text/x-sh", "name": "bash" }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "toc_cell": false, "toc_position": {}, "toc_section_display": "block", "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }