├── .ipynb_checkpoints
    └── 01.Introduction to Research-checkpoint.ipynb
├── 01.Introduction to Research.ipynb
├── 02.Introduction to Python.ipynb
├── 03.Introduction to NumPy.ipynb
├── 04.Introduction to Pandas.ipynb
├── 05.Plotting Data.ipynb
├── A Professional Quant Equity Workflow.md
├── README.md
├── data
    └── IF888-2011.mat
├── html
    └── 1.Introduction+to+Research.html
└── image
    ├── cell_mode_change.jpg
    └── workflow.jpg

/.ipynb_checkpoints/01.Introduction to Research-checkpoint.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to the Research Environment\n", 8 | "\n", 9 | "The research environment is powered by IPython notebooks, which allow one to perform a great deal of data analysis and statistical validation. We'll demonstrate a few simple techniques here." 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "# 研究环境的介绍\n", 17 | "这是一个基于jupyter notebook(原IPython notebook)的研究环境,能出色的完成大数据的分析和统计验证。我们将在这里做一些简单的技巧演示。" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": {}, 23 | "source": [ 24 | "## Code Cells vs. Text Cells\n", 25 | "\n", 26 | "As you can see, each cell can be either code or text. To select between them, choose from the 'Cell Type' dropdown menu on the top left." 
27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "## 代码单元格 vs 文本单元格\n", 34 | "如你所见,单元格可以是代码编辑模式,也可以是文本编辑模式。可用通过点击顶部菜单的`Cell`的下拉菜单中`Cell Type`来进行选择。\n", 35 | "亦可通过顶部菜单中下拉菜单进行选择:`Code`即为代码编辑模式,`Markdown`即为文本编辑模式。如下图所示:\n", 36 | "\n", 37 | "![image](image/cell_mode_change.jpg)\n", 38 | "\n", 39 | "在单元格未进入编辑时,可以通过快捷键进行切换,`Y`切换成代码模式,`M`切换成文本编辑模式,当单元格处于编辑状态时,可以按`Esc`退出编辑模式, 按`Enter`进入编辑模式。" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "## Executing a Command\n", 47 | "\n", 48 | "A code cell will be evaluated when you press play, or when you press the shortcut, shift-enter. Evaluating a cell evaluates each line of code in sequence, and prints the results of the last line below the cell." 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "## 执行一个命令\n", 56 | "当处于`代码编辑`模式时,通过快捷键`Shift`+`Enter`来逐行执行单元格中代码,并会打印出单元格最后一行代码执行的结果。" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": { 63 | "collapsed": true 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "2 + 2" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "有时候并没有打印结果,比如说赋值的时候." 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": { 81 | "collapsed": true 82 | }, 83 | "outputs": [], 84 | "source": [ 85 | "X = 2" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "记住,只有最后一行的代码执行的结果会被打印出来." 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": { 99 | "collapsed": true 100 | }, 101 | "outputs": [], 102 | "source": [ 103 | "2 + 2\n", 104 | "3 + 3" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "你可以通过`print`来打印你想要的代码结果." 
112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": { 118 | "collapsed": true, 119 | "scrolled": false 120 | }, 121 | "outputs": [], 122 | "source": [ 123 | "print(2 + 2)\n", 124 | "3 + 3" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": {}, 130 | "source": [ 131 | "## Knowing When a Cell is Running\n", 132 | "\n", 133 | "While a cell is running, a `[*]` will display on the left. When a cell has yet to be executed, `[ ]` will display. Once it has run, a number such as `[5]` will display, indicating the order in which it was run during the execution of the notebook. Try it on the cell below and watch it happen." 134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "## 单元格的状态\n", 141 | "当一个单元格左侧的标记为`[*]`时,表明程序正在运行.`[ ]`表明单元格还未执行.当一个单元格运行完毕后,会在这对方括号中加上一个数字,表示已完成的状态,比如`[5]`.\n", 142 | "- 特别的说明:当你重新打开notebook,或者重启kernel(并未清理结果)之后,单元格左侧的方框虽然有数字,但其实并未运行." 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": null, 148 | "metadata": {}, 149 | "outputs": [], 150 | "source": [ 151 | "# 花点时间去运行一段代码\n", 152 | "c = 0\n", 153 | "for i in range(10000000):\n", 154 | " c = c + i\n", 155 | "c" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "## Importing Libraries\n", 163 | "\n", 164 | "The vast majority of the time, you'll want to use functions from pre-built libraries. You can't import every library on Quantopian due to security issues, but you can import most of the common scientific ones. Here I import numpy and pandas, the two most common and useful libraries in quant finance. I recommend copying this import statement to every new notebook.\n", 165 | "\n", 166 | "Notice that you can rename libraries to whatever you want after importing. The `as` statement allows this. Here we use `np` and `pd` as aliases for `numpy` and `pandas`. 
This is a very common aliasing and will be found in most code snippets around the web. The point behind this is to allow you to type fewer characters when you are frequently accessing these libraries." 167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "## 导入库\n", 174 | "绝大多数时候,我们需要使用预构建库(pre-built libraries)中的函数。出于安全考虑(基于平台的角度),我们无法导入Quantopian上的每个库,但是可以导入最常见,用于科学计算的库。这里我将会导入`numpy`和`pandas`,两个在量化金融中常见且实用的库。\n", 175 | "我建议将这个导入代码复制到每一个新的notebook.\n", 176 | "\n", 177 | "你可以使用`as`关键字将导入的库重命名,我们将使用 `np`和`pd`分别作为`numpy`和`pandas`的别名,这算是全世界大家公认的别名。使用别名的好处就是当你频繁使用者这些库时,能输入较少的字符。" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": null, 183 | "metadata": { 184 | "collapsed": true 185 | }, 186 | "outputs": [], 187 | "source": [ 188 | "import numpy as np\n", 189 | "import pandas as pd\n", 190 | "from scipy.io import loadmat\n", 191 | "# 这是一个在画图方面非常优秀的库\n", 192 | "import matplotlib.pyplot as plt\n", 193 | "import datetime\n", 194 | "\n", 195 | "%matplotlib inline" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": {}, 201 | "source": [ 202 | "## Tab Autocomplete\n", 203 | "\n", 204 | "Pressing tab will give you a list of IPython's best guesses for what you might want to type next. This is incredibly valuable and will save you a lot of time. If there is only one possible option for what you could type next, IPython will fill that in for you. Try pressing tab very frequently, it will seldom fill in anything you don't want, as if there is ambiguity a list will be shown. This is a great way to see what functions are available in a library.\n", 205 | "\n", 206 | "Try placing your cursor after the `.` and pressing tab." 
207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "metadata": {}, 212 | "source": [ 213 | "## Tab键 代码自动补全\n", 214 | "当你按下`Tab`键时,会自动出现一个猜测你下一步,你可能输入内容的列表。这个功能很实用而且能省不少事。如果只有一个可能选项,IPython则会自动帮助你完成输入。请频繁的按下`Tab`吧,它几乎不会出现你不想要的内容,一如这里有歧义的话,就会出现一个列表。想看某个库中有哪些函数,这是一个不错的方式。\n", 215 | "\n", 216 | "请将鼠标的置于`np.random.`中最后一个`.`之后,并按下`Tab`。" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": null, 222 | "metadata": { 223 | "collapsed": true 224 | }, 225 | "outputs": [], 226 | "source": [ 227 | "np.random." 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "## Getting Documentation Help\n", 235 | "\n", 236 | "Placing a question mark after a function and executing that line of code will give you the documentation IPython has for that function. It's often best to do this in a new cell, as you avoid re-executing other code and running into bugs." 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": {}, 242 | "source": [ 243 | "## 获取文档帮助\n", 244 | "在IPython中,在一个函数后面加上`?`并执行该代码,就会获得该函数的文本帮助。建议通常在一个新的单元格执行,这样能避免重新执行其他代码或者发生bug." 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": null, 250 | "metadata": { 251 | "collapsed": true 252 | }, 253 | "outputs": [], 254 | "source": [ 255 | "np.random.normal?" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "## Sampling\n", 263 | "\n", 264 | "We'll sample some random data using a function from `numpy`." 
265 | ] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "metadata": {}, 270 | "source": [ 271 | "## 生成样本\n", 272 | "我们将用`numpy`中的函数随机生成一些数据" 273 | ] 274 | }, 275 | { 276 | "cell_type": "markdown", 277 | "metadata": {}, 278 | "source": [ 279 | "- 译者注:通常计算机中的随机函数,都是伪随机,在`numpy`中,我们通过使用`np.random.seed()`函数来控制随机结果" 280 | ] 281 | }, 282 | { 283 | "cell_type": "code", 284 | "execution_count": null, 285 | "metadata": { 286 | "collapsed": true 287 | }, 288 | "outputs": [], 289 | "source": [ 290 | "# Sample 100 points with a mean of 0 and an std of 1. This is a standard normal distribution.\n", 291 | "# 随机生成100样本点,这些点服从均值为0,标准差为1的正态分布。\n", 292 | "np.random.seed(1) # 确保大家生成的样本保持一致\n", 293 | "\n", 294 | "X = np.random.normal(0, 1, 100)" 295 | ] 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": {}, 300 | "source": [ 301 | "## Plotting\n", 302 | "\n", 303 | "We can use the plotting library we imported as follows." 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "## 画图\n", 311 | "我们使用刚才导入的库进行画图\n", 312 | "\n", 313 | "`import matplotlib.pyplot as plt`" 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": null, 319 | "metadata": { 320 | "collapsed": true 321 | }, 322 | "outputs": [], 323 | "source": [ 324 | "plt.plot(X)" 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "### Squelching Line Output\n", 332 | "\n", 333 | "You might have noticed the annoying line of the form `[]` before the plots. This is because the `.plot` function actually produces output. Sometimes we wish not to display output, we can accomplish this with the semi-colon as follows." 
334 | ] 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": {}, 339 | "source": [ 340 | "### 让输出更纯净\n", 341 | "你可能注意到我们画的图前面有一行非常讨厌的输出:`[]`\n", 342 | "\n", 343 | "这是`.plot`函数输出结果,有时候我们并不希望他出现,我们可以加上`;`实现这个功能;" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": null, 349 | "metadata": { 350 | "collapsed": true 351 | }, 352 | "outputs": [], 353 | "source": [ 354 | "plt.plot(X);" 355 | ] 356 | }, 357 | { 358 | "cell_type": "markdown", 359 | "metadata": {}, 360 | "source": [ 361 | "### Adding Axis Labels\n", 362 | "\n", 363 | "No self-respecting quant leaves a graph without labeled axes. Here are some commands to help with that." 364 | ] 365 | }, 366 | { 367 | "cell_type": "markdown", 368 | "metadata": {}, 369 | "source": [ 370 | "### 添加坐标轴标签\n", 371 | "有追求的宽客是不会让图表没有坐标轴标签的。\n", 372 | "\n", 373 | "这里有一些命令可以帮我们实现他。" 374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "execution_count": null, 379 | "metadata": { 380 | "collapsed": true 381 | }, 382 | "outputs": [], 383 | "source": [ 384 | "np.random.seed(2)\n", 385 | "\n", 386 | "X = np.random.normal(0, 1, 100)\n", 387 | "X2 = np.random.normal(0, 1, 100)\n", 388 | "\n", 389 | "plt.plot(X);\n", 390 | "plt.plot(X2);\n", 391 | "plt.xlabel('Time') # 我们生成的数据是没有单位的,但不要忘记他的单位。\n", 392 | "plt.ylabel('Returns')\n", 393 | "plt.legend(['X', 'X2']);" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": {}, 399 | "source": [ 400 | "## Generating Statistics\n", 401 | "\n", 402 | "Let's use `numpy` to take some simple statistics." 
403 | ] 404 | }, 405 | { 406 | "cell_type": "markdown", 407 | "metadata": {}, 408 | "source": [ 409 | "## 计算统计值\n", 410 | "让我们用`numpy`来做一些简单的统计计算。" 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": null, 416 | "metadata": { 417 | "collapsed": true 418 | }, 419 | "outputs": [], 420 | "source": [ 421 | "np.mean(X)" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": null, 427 | "metadata": { 428 | "collapsed": true 429 | }, 430 | "outputs": [], 431 | "source": [ 432 | "np.std(X)" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "## Getting Real Pricing Data\n", 440 | "\n", 441 | "Randomly sampled data can be great for testing ideas, but let's get some real data. We can use `get_pricing` to do that. You can use the `?` syntax as discussed above to get more information on `get_pricing`'s arguments." 442 | ] 443 | }, 444 | { 445 | "cell_type": "markdown", 446 | "metadata": {}, 447 | "source": [ 448 | "## 获取真实的价格数据\n", 449 | "\n", 450 | "随机生成的数据能很好检验我们的想法,但是我们最终也是需要真实的数据来进行检测。我们将用`get_pricing`函数提取数据,可以使用刚才我们提到`?`命令去获取更多关于`get_pricing`函数的参数信息。\n", 451 | "\n", 452 | "- 译者注:请注意`get_pricing`为Quantopian函数,本地使用相关函数是无效的。" 453 | ] 454 | }, 455 | { 456 | "cell_type": "code", 457 | "execution_count": null, 458 | "metadata": { 459 | "collapsed": true 460 | }, 461 | "outputs": [], 462 | "source": [ 463 | "# data = get_pricing('MSFT', start_date='2012-1-1', end_date='2015-6-1')" 464 | ] 465 | }, 466 | { 467 | "cell_type": "markdown", 468 | "metadata": {}, 469 | "source": [ 470 | "- 译者为大家提供一个本地的数据,以便完成后面的练习" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": null, 476 | "metadata": { 477 | "collapsed": true 478 | }, 479 | "outputs": [], 480 | "source": [ 481 | "file_name = 'data/IF888-2011.mat'\n", 482 | "origin_data = loadmat(file_name)\n", 483 | "data = pd.DataFrame(origin_data['IF888'])\n", 484 | "data.columns = ['date', 'price']\n", 485 | "origin = 
np.datetime64('0000-01-01', 'D') - np.timedelta64(1, 'D')\n", 486 | "data['date'] = data['date'].map(lambda x: x * np.timedelta64(1, 'D') + origin)\n", 487 | "data.index = data['date'].tolist()\n", 488 | "data = data.iloc[:,1:]" 489 | ] 490 | }, 491 | { 492 | "cell_type": "markdown", 493 | "metadata": {}, 494 | "source": [ 495 | "Our data is now a dataframe. You can see the datetime index and the columns with different pricing data." 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": {}, 501 | "source": [ 502 | "`data`的数据类型是`pandas`中的`dataframe`。可以通过`index`和`columns`来查看不同维度下价格数据" 503 | ] 504 | }, 505 | { 506 | "cell_type": "markdown", 507 | "metadata": {}, 508 | "source": [ 509 | "This is a pandas dataframe, so we can index into it to get just the price, like this. For more info on pandas, please [click here](http://pandas.pydata.org/pandas-docs/stable/10min.html)." 510 | ] 511 | }, 512 | { 513 | "cell_type": "markdown", 514 | "metadata": {}, 515 | "source": [ 516 | "这是一个pandas dataframe, 我们可以使用index去获取价格信息. 更多关于pandas的信息请[点击这里](http://pandas.pydata.org/pandas-docs/stable/10min.html)." 517 | ] 518 | }, 519 | { 520 | "cell_type": "code", 521 | "execution_count": null, 522 | "metadata": { 523 | "collapsed": true 524 | }, 525 | "outputs": [], 526 | "source": [ 527 | "X = data['price']" 528 | ] 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": {}, 533 | "source": [ 534 | "Because there is now also date information in our data, we provide two series to `.plot`. `X.index` gives us the datetime index, and `X.values` gives us the pricing values. These are used as the X and Y coordinates to make a graph." 
535 | ] 536 | }, 537 | { 538 | "cell_type": "markdown", 539 | "metadata": {}, 540 | "source": [ 541 | "因为这里有日期信息在我们的数据中,我们将`X.index`得到的日期索引数据,和`X.values`得到的价格数据作为`.plot`X和Y轴数据用于画图。" 542 | ] 543 | }, 544 | { 545 | "cell_type": "code", 546 | "execution_count": null, 547 | "metadata": { 548 | "collapsed": true 549 | }, 550 | "outputs": [], 551 | "source": [ 552 | "plt.plot(X.index, X.values)\n", 553 | "plt.ylabel('Price')\n", 554 | "plt.legend(['IF888'])" 555 | ] 556 | }, 557 | { 558 | "cell_type": "markdown", 559 | "metadata": {}, 560 | "source": [ 561 | "我们在真实数据上来做点统计计算。" 562 | ] 563 | }, 564 | { 565 | "cell_type": "code", 566 | "execution_count": null, 567 | "metadata": { 568 | "collapsed": true 569 | }, 570 | "outputs": [], 571 | "source": [ 572 | "np.mean(X)" 573 | ] 574 | }, 575 | { 576 | "cell_type": "code", 577 | "execution_count": null, 578 | "metadata": { 579 | "collapsed": true 580 | }, 581 | "outputs": [], 582 | "source": [ 583 | "np.std(X)" 584 | ] 585 | }, 586 | { 587 | "cell_type": "markdown", 588 | "metadata": { 589 | "collapsed": true 590 | }, 591 | "source": [ 592 | "## Getting Returns from Prices\n", 593 | "\n", 594 | "We can use the `pct_change` function to get returns. Notice how we drop the first element after doing this, as it will be `NaN` (nothing -> something results in a NaN percent change)." 595 | ] 596 | }, 597 | { 598 | "cell_type": "markdown", 599 | "metadata": {}, 600 | "source": [ 601 | "## 根据价格数据计算回报\n", 602 | "我们用`pct_change`函数提取回报,注意:返回的第一个元素是`NaN`,所以我们会忽略掉他(因为第一个元素之前没有元素,无法比较,故返回一个NaN)" 603 | ] 604 | }, 605 | { 606 | "cell_type": "code", 607 | "execution_count": null, 608 | "metadata": { 609 | "collapsed": true 610 | }, 611 | "outputs": [], 612 | "source": [ 613 | "R = X.pct_change()[1:]" 614 | ] 615 | }, 616 | { 617 | "cell_type": "markdown", 618 | "metadata": {}, 619 | "source": [ 620 | "We can plot the returns distribution as a histogram." 
621 | ] 622 | }, 623 | { 624 | "cell_type": "markdown", 625 | "metadata": {}, 626 | "source": [ 627 | "我们可以用一个直方图去描述回报分布" 628 | ] 629 | }, 630 | { 631 | "cell_type": "code", 632 | "execution_count": null, 633 | "metadata": { 634 | "collapsed": true 635 | }, 636 | "outputs": [], 637 | "source": [ 638 | "plt.hist(R, bins=20)\n", 639 | "plt.xlabel('Returns')\n", 640 | "plt.ylabel('Frequency')\n", 641 | "plt.grid(True)\n", 642 | "plt.legend(['IF888 Returns']);" 643 | ] 644 | }, 645 | { 646 | "cell_type": "markdown", 647 | "metadata": {}, 648 | "source": [ 649 | "Get statistics again." 650 | ] 651 | }, 652 | { 653 | "cell_type": "markdown", 654 | "metadata": {}, 655 | "source": [ 656 | "再次计算统计量" 657 | ] 658 | }, 659 | { 660 | "cell_type": "code", 661 | "execution_count": null, 662 | "metadata": { 663 | "collapsed": true 664 | }, 665 | "outputs": [], 666 | "source": [ 667 | "np.mean(R) # the same as R.mean()" 668 | ] 669 | }, 670 | { 671 | "cell_type": "code", 672 | "execution_count": null, 673 | "metadata": { 674 | "collapsed": true 675 | }, 676 | "outputs": [], 677 | "source": [ 678 | "R.std() # sample std (ddof=1); np.std(R) defaults to ddof=0, so it differs slightly" 679 | ] 680 | }, 681 | { 682 | "cell_type": "markdown", 683 | "metadata": {}, 684 | "source": [ 685 | "Now let's go backwards and generate data out of a normal distribution using the statistics we estimated from IF888's returns. We'll see that we have good reason to suspect IF888's returns may not be normal, as the resulting normal distribution looks far different." 
686 | ] 687 | }, 688 | { 689 | "cell_type": "markdown", 690 | "metadata": {}, 691 | "source": [ 692 | "现在我们用回报数据的均值和标准差作为一个正态分布的参数去生成数据,我们有理由相信`IF888`的回报并不服从正态分布,因为对比后发现,两者相去甚远。" 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "execution_count": null, 698 | "metadata": { 699 | "collapsed": true 700 | }, 701 | "outputs": [], 702 | "source": [ 703 | "np.random.seed(3)\n", 704 | "\n", 705 | "plt.hist(np.random.normal(np.mean(R), np.std(R), 10000), bins=20)\n", 706 | "plt.xlabel('Returns')\n", 707 | "plt.ylabel('Frequency')\n", 708 | "plt.grid(True)\n", 709 | "plt.legend(['Normal Distribution Returns'], loc='best');" 710 | ] 711 | }, 712 | { 713 | "cell_type": "markdown", 714 | "metadata": {}, 715 | "source": [ 716 | "## Generating a Moving Average\n", 717 | "\n", 718 | "`pandas` has some nice tools to allow us to generate rolling statistics. Here's an example. Notice how there's no moving average for the first 59 days, as we don't yet have a full 60 days of data on which to generate the statistic." 
719 | ] 720 | }, 721 | { 722 | "cell_type": "markdown", 723 | "metadata": {}, 724 | "source": [ 725 | "## 生成一条移动均线\n", 726 | "`pandas`有非常优秀的工具能让我们生成rolling statistics.这里有个例子.注意我们是没有足够的数据去生成前60日均线数据的。 " 727 | ] 728 | }, 729 | { 730 | "cell_type": "code", 731 | "execution_count": null, 732 | "metadata": { 733 | "collapsed": true 734 | }, 735 | "outputs": [], 736 | "source": [ 737 | "# Take the average of the last 60 days at each timepoint.\n", 738 | "MAVG = X.rolling(window=60).mean()\n", 739 | "plt.plot(X.index, X.values)\n", 740 | "plt.plot(MAVG.index, MAVG.values)\n", 741 | "plt.ylabel('Price')\n", 742 | "plt.legend(['IF888', '60-day MAVG']);" 743 | ] 744 | } 745 | ], 746 | "metadata": { 747 | "kernelspec": { 748 | "display_name": "Python 3 tensorflow", 749 | "language": "python", 750 | "name": "python3" 751 | }, 752 | "language_info": { 753 | "codemirror_mode": { 754 | "name": "ipython", 755 | "version": 3 756 | }, 757 | "file_extension": ".py", 758 | "mimetype": "text/x-python", 759 | "name": "python", 760 | "nbconvert_exporter": "python", 761 | "pygments_lexer": "ipython3", 762 | "version": "3.6.4" 763 | }, 764 | "toc": { 765 | "nav_menu": {}, 766 | "number_sections": true, 767 | "sideBar": true, 768 | "skip_h1_title": false, 769 | "title_cell": "Table of Contents", 770 | "title_sidebar": "Contents", 771 | "toc_cell": false, 772 | "toc_position": {}, 773 | "toc_section_display": true, 774 | "toc_window_display": false 775 | }, 776 | "varInspector": { 777 | "cols": { 778 | "lenName": 16, 779 | "lenType": 16, 780 | "lenVar": 40 781 | }, 782 | "kernels_config": { 783 | "python": { 784 | "delete_cmd_postfix": "", 785 | "delete_cmd_prefix": "del ", 786 | "library": "var_list.py", 787 | "varRefreshCmd": "print(var_dic_list())" 788 | }, 789 | "r": { 790 | "delete_cmd_postfix": ") ", 791 | "delete_cmd_prefix": "rm(", 792 | "library": "var_list.r", 793 | "varRefreshCmd": "cat(var_dic_list()) " 794 | } 795 | }, 796 | "types_to_exclude": [ 797 | "module", 798 | 
"function", 799 | "builtin_function_or_method", 800 | "instance", 801 | "_Feature" 802 | ], 803 | "window_display": false 804 | } 805 | }, 806 | "nbformat": 4, 807 | "nbformat_minor": 2 808 | } 809 | -------------------------------------------------------------------------------- /01.Introduction to Research.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to the Research Environment\n", 8 | "\n", 9 | "The research environment is powered by IPython notebooks, which allow one to perform a great deal of data analysis and statistical validation. We'll demonstrate a few simple techniques here." 10 | ] 11 | }, 12 | { 13 | "cell_type": "markdown", 14 | "metadata": {}, 15 | "source": [ 16 | "# 研究环境的介绍\n", 17 | "这是一个基于jupyter notebook(原IPython notebook)的研究环境,能出色的完成大数据的分析和统计验证。我们将在这里做一些简单的技巧演示。" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": {}, 23 | "source": [ 24 | "## Code Cells vs. Text Cells\n", 25 | "\n", 26 | "As you can see, each cell can be either code or text. To select between them, choose from the 'Cell Type' dropdown menu on the top left." 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "## 代码单元格 vs 文本单元格\n", 34 | "如你所见,单元格可以是代码编辑模式,也可以是文本编辑模式。可用通过点击顶部菜单的`Cell`的下拉菜单中`Cell Type`来进行选择。\n", 35 | "亦可通过顶部菜单中下拉菜单进行选择:`Code`即为代码编辑模式,`Markdown`即为文本编辑模式。如下图所示:\n", 36 | "\n", 37 | "![image](image/cell_mode_change.jpg)\n", 38 | "\n", 39 | "在单元格未进入编辑时,可以通过快捷键进行切换,`Y`切换成代码模式,`M`切换成文本编辑模式,当单元格处于编辑状态时,可以按`Esc`退出编辑模式, 按`Enter`进入编辑模式。" 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "## Executing a Command\n", 47 | "\n", 48 | "A code cell will be evaluated when you press play, or when you press the shortcut, shift-enter. 
Evaluating a cell evaluates each line of code in sequence, and prints the results of the last line below the cell." 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "## 执行一个命令\n", 56 | "当处于`代码编辑`模式时,通过快捷键`Shift`+`Enter`来逐行执行单元格中代码,并会打印出单元格最后一行代码执行的结果。" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": { 63 | "collapsed": true 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "2 + 2" 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "metadata": {}, 73 | "source": [ 74 | "有时候并没有打印结果,比如说赋值的时候." 75 | ] 76 | }, 77 | { 78 | "cell_type": "code", 79 | "execution_count": null, 80 | "metadata": { 81 | "collapsed": true 82 | }, 83 | "outputs": [], 84 | "source": [ 85 | "X = 2" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": [ 92 | "记住,只有最后一行的代码执行的结果会被打印出来." 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": { 99 | "collapsed": true 100 | }, 101 | "outputs": [], 102 | "source": [ 103 | "2 + 2\n", 104 | "3 + 3" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "你可以通过`print`来打印你想要的代码结果." 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 117 | "metadata": { 118 | "collapsed": true, 119 | "scrolled": false 120 | }, 121 | "outputs": [], 122 | "source": [ 123 | "print(2 + 2)\n", 124 | "3 + 3" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "metadata": {}, 130 | "source": [ 131 | "## Knowing When a Cell is Running\n", 132 | "\n", 133 | "While a cell is running, a `[*]` will display on the left. When a cell has yet to be executed, `[ ]` will display. Once it has run, a number such as `[5]` will display, indicating the order in which it was run during the execution of the notebook. Try it on the cell below and watch it happen." 
134 | ] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "## 单元格的状态\n", 141 | "当一个单元格左侧的标记为`[*]`时,表明程序正在运行.`[ ]`表明单元格还未执行.当一个单元格运行完毕后,会在这对方括号中加上一个数字,表示已完成的状态,比如`[5]`.\n", 142 | "- 特别的说明:当你重新打开notebook,或者重启kernel(并未清理结果)之后,单元格左侧的方框虽然有数字,但其实并未运行." 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": null, 148 | "metadata": {}, 149 | "outputs": [], 150 | "source": [ 151 | "# 花点时间去运行一段代码\n", 152 | "c = 0\n", 153 | "for i in range(10000000):\n", 154 | " c = c + i\n", 155 | "c" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "## Importing Libraries\n", 163 | "\n", 164 | "The vast majority of the time, you'll want to use functions from pre-built libraries. You can't import every library on Quantopian due to security issues, but you can import most of the common scientific ones. Here I import numpy and pandas, the two most common and useful libraries in quant finance. I recommend copying this import statement to every new notebook.\n", 165 | "\n", 166 | "Notice that you can rename libraries to whatever you want after importing. The `as` statement allows this. Here we use `np` and `pd` as aliases for `numpy` and `pandas`. This is a very common aliasing and will be found in most code snippets around the web. The point behind this is to allow you to type fewer characters when you are frequently accessing these libraries." 
167 | ] 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": [ 173 | "## 导入库\n", 174 | "绝大多数时候,我们需要使用预构建库(pre-built libraries)中的函数。出于安全考虑(基于平台的角度),我们无法导入Quantopian上的每个库,但是可以导入最常见,用于科学计算的库。这里我将会导入`numpy`和`pandas`,两个在量化金融中常见且实用的库。\n", 175 | "我建议将这个导入代码复制到每一个新的notebook.\n", 176 | "\n", 177 | "你可以使用`as`关键字将导入的库重命名,我们将使用 `np`和`pd`分别作为`numpy`和`pandas`的别名,这算是全世界大家公认的别名。使用别名的好处就是当你频繁使用者这些库时,能输入较少的字符。" 178 | ] 179 | }, 180 | { 181 | "cell_type": "code", 182 | "execution_count": null, 183 | "metadata": { 184 | "collapsed": true 185 | }, 186 | "outputs": [], 187 | "source": [ 188 | "import numpy as np\n", 189 | "import pandas as pd\n", 190 | "from scipy.io import loadmat\n", 191 | "# 这是一个在画图方面非常优秀的库\n", 192 | "import matplotlib.pyplot as plt\n", 193 | "import datetime\n", 194 | "\n", 195 | "%matplotlib inline" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": {}, 201 | "source": [ 202 | "## Tab Autocomplete\n", 203 | "\n", 204 | "Pressing tab will give you a list of IPython's best guesses for what you might want to type next. This is incredibly valuable and will save you a lot of time. If there is only one possible option for what you could type next, IPython will fill that in for you. Try pressing tab very frequently, it will seldom fill in anything you don't want, as if there is ambiguity a list will be shown. This is a great way to see what functions are available in a library.\n", 205 | "\n", 206 | "Try placing your cursor after the `.` and pressing tab." 
207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "metadata": {}, 212 | "source": [ 213 | "## Tab键 代码自动补全\n", 214 | "当你按下`Tab`键时,会自动出现一个猜测你下一步,你可能输入内容的列表。这个功能很实用而且能省不少事。如果只有一个可能选项,IPython则会自动帮助你完成输入。请频繁的按下`Tab`吧,它几乎不会出现你不想要的内容,一如这里有歧义的话,就会出现一个列表。想看某个库中有哪些函数,这是一个不错的方式。\n", 215 | "\n", 216 | "请将鼠标的置于`np.random.`中最后一个`.`之后,并按下`Tab`。" 217 | ] 218 | }, 219 | { 220 | "cell_type": "code", 221 | "execution_count": null, 222 | "metadata": { 223 | "collapsed": true 224 | }, 225 | "outputs": [], 226 | "source": [ 227 | "np.random." 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "## Getting Documentation Help\n", 235 | "\n", 236 | "Placing a question mark after a function and executing that line of code will give you the documentation IPython has for that function. It's often best to do this in a new cell, as you avoid re-executing other code and running into bugs." 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "metadata": {}, 242 | "source": [ 243 | "## 获取文档帮助\n", 244 | "在IPython中,在一个函数后面加上`?`并执行该代码,就会获得该函数的文本帮助。建议通常在一个新的单元格执行,这样能避免重新执行其他代码或者发生bug." 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": null, 250 | "metadata": { 251 | "collapsed": true 252 | }, 253 | "outputs": [], 254 | "source": [ 255 | "np.random.normal?" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "## Sampling\n", 263 | "\n", 264 | "We'll sample some random data using a function from `numpy`." 
265 | ] 266 | }, 267 | { 268 | "cell_type": "markdown", 269 | "metadata": {}, 270 | "source": [ 271 | "## 生成样本\n", 272 | "我们将用`numpy`中的函数随机生成一些数据" 273 | ] 274 | }, 275 | { 276 | "cell_type": "markdown", 277 | "metadata": {}, 278 | "source": [ 279 | "- 译者注:通常计算机中的随机函数,都是伪随机,在`numpy`中,我们通过使用`np.random.seed()`函数来控制随机结果" 280 | ] 281 | }, 282 | { 283 | "cell_type": "code", 284 | "execution_count": null, 285 | "metadata": { 286 | "collapsed": true 287 | }, 288 | "outputs": [], 289 | "source": [ 290 | "# Sample 100 points with a mean of 0 and an std of 1. This is a standard normal distribution.\n", 291 | "# 随机生成100样本点,这些点服从均值为0,标准差为1的正态分布。\n", 292 | "np.random.seed(1) # 确保大家生成的样本保持一致\n", 293 | "\n", 294 | "X = np.random.normal(0, 1, 100)" 295 | ] 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": {}, 300 | "source": [ 301 | "## Plotting\n", 302 | "\n", 303 | "We can use the plotting library we imported as follows." 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "## 画图\n", 311 | "我们使用刚才导入的库进行画图\n", 312 | "\n", 313 | "`import matplotlib.pyplot as plt`" 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": null, 319 | "metadata": { 320 | "collapsed": true 321 | }, 322 | "outputs": [], 323 | "source": [ 324 | "plt.plot(X)" 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "### Squelching Line Output\n", 332 | "\n", 333 | "You might have noticed the annoying line of the form `[]` before the plots. This is because the `.plot` function actually produces output. Sometimes we wish not to display output, we can accomplish this with the semi-colon as follows." 
334 | ] 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": {}, 339 | "source": [ 340 | "### 让输出更纯净\n", 341 | "你可能注意到我们画的图前面有一行非常讨厌的输出:`[]`\n", 342 | "\n", 343 | "这是`.plot`函数输出结果,有时候我们并不希望他出现,我们可以加上`;`实现这个功能;" 344 | ] 345 | }, 346 | { 347 | "cell_type": "code", 348 | "execution_count": null, 349 | "metadata": { 350 | "collapsed": true 351 | }, 352 | "outputs": [], 353 | "source": [ 354 | "plt.plot(X);" 355 | ] 356 | }, 357 | { 358 | "cell_type": "markdown", 359 | "metadata": {}, 360 | "source": [ 361 | "### Adding Axis Labels\n", 362 | "\n", 363 | "No self-respecting quant leaves a graph without labeled axes. Here are some commands to help with that." 364 | ] 365 | }, 366 | { 367 | "cell_type": "markdown", 368 | "metadata": {}, 369 | "source": [ 370 | "### 添加坐标轴标签\n", 371 | "有追求的宽客是不会让图表没有坐标轴标签的。\n", 372 | "\n", 373 | "这里有一些命令可以帮我们实现他。" 374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "execution_count": null, 379 | "metadata": { 380 | "collapsed": true 381 | }, 382 | "outputs": [], 383 | "source": [ 384 | "np.random.seed(2)\n", 385 | "\n", 386 | "X = np.random.normal(0, 1, 100)\n", 387 | "X2 = np.random.normal(0, 1, 100)\n", 388 | "\n", 389 | "plt.plot(X);\n", 390 | "plt.plot(X2);\n", 391 | "plt.xlabel('Time') # 我们生成的数据是没有单位的,但不要忘记他的单位。\n", 392 | "plt.ylabel('Returns')\n", 393 | "plt.legend(['X', 'X2']);" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": {}, 399 | "source": [ 400 | "## Generating Statistics\n", 401 | "\n", 402 | "Let's use `numpy` to take some simple statistics." 
403 | ] 404 | }, 405 | { 406 | "cell_type": "markdown", 407 | "metadata": {}, 408 | "source": [ 409 | "## 计算统计值\n", 410 | "让我们用`numpy`来做一些简单的统计计算。" 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": null, 416 | "metadata": { 417 | "collapsed": true 418 | }, 419 | "outputs": [], 420 | "source": [ 421 | "np.mean(X)" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": null, 427 | "metadata": { 428 | "collapsed": true 429 | }, 430 | "outputs": [], 431 | "source": [ 432 | "np.std(X)" 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "## Getting Real Pricing Data\n", 440 | "\n", 441 | "Randomly sampled data can be great for testing ideas, but let's get some real data. We can use `get_pricing` to do that. You can use the `?` syntax as discussed above to get more information on `get_pricing`'s arguments." 442 | ] 443 | }, 444 | { 445 | "cell_type": "markdown", 446 | "metadata": {}, 447 | "source": [ 448 | "### 获取真实的价格数据\n", 449 | "\n", 450 | "随机生成的数据能很好检验我们的想法,但是我们最终也是需要真实的数据来进行检测。我们将用`get_pricing`函数提取数据,可以使用刚才我们提到的`?`命令去获取更多关于`get_pricing`函数的参数信息。\n", 451 | "\n", 452 | "- 译者注:请注意`get_pricing`为Quantopian平台的函数,在本地环境中无法使用。" 453 | ] 454 | }, 455 | { 456 | "cell_type": "code", 457 | "execution_count": null, 458 | "metadata": { 459 | "collapsed": true 460 | }, 461 | "outputs": [], 462 | "source": [ 463 | "# data = get_pricing('MSFT', start_date='2012-1-1', end_date='2015-6-1')" 464 | ] 465 | }, 466 | { 467 | "cell_type": "markdown", 468 | "metadata": {}, 469 | "source": [ 470 | "- 译者为大家提供一个本地的数据,以便完成后面的练习" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": null, 476 | "metadata": { 477 | "collapsed": true 478 | }, 479 | "outputs": [], 480 | "source": [ 481 | "file_name = r'data\\IF888-2011.mat'\n", 482 | "origin_data = loadmat(file_name)\n", 483 | "data = pd.DataFrame(origin_data['IF888'])\n", 484 | "data.columns = ['date', 'price']\n", 485 | "origin = 
np.datetime64('0000-01-01', 'D') - np.timedelta64(1, 'D')\n", 486 | "data['date'] = data['date'].map(lambda x: x * np.timedelta64(1, 'D') + origin)\n", 487 | "data.index = data['date'].tolist()\n", 488 | "data = data.iloc[:,1:]" 489 | ] 490 | }, 491 | { 492 | "cell_type": "markdown", 493 | "metadata": {}, 494 | "source": [ 495 | "Our data is now a dataframe. You can see the datetime index and the column holding the pricing data." 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": {}, 501 | "source": [ 502 | "`data`的数据类型是`pandas`中的`dataframe`。可以通过`index`和`columns`来查看日期索引和价格数据所在的列" 503 | ] 504 | }, 505 | { 506 | "cell_type": "markdown", 507 | "metadata": {}, 508 | "source": [ 509 | "This is a pandas dataframe, so we can index into it to get just the price like this. For more info on pandas, please [click here](http://pandas.pydata.org/pandas-docs/stable/10min.html)." 510 | ] 511 | }, 512 | { 513 | "cell_type": "markdown", 514 | "metadata": {}, 515 | "source": [ 516 | "这是一个pandas dataframe, 我们可以通过索引的方式只取出价格数据. 更多关于pandas的信息请[点击这里](http://pandas.pydata.org/pandas-docs/stable/10min.html)." 517 | ] 518 | }, 519 | { 520 | "cell_type": "code", 521 | "execution_count": null, 522 | "metadata": { 523 | "collapsed": true 524 | }, 525 | "outputs": [], 526 | "source": [ 527 | "X = data['price']" 528 | ] 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": {}, 533 | "source": [ 534 | "Because there is now also date information in our data, we provide two series to `.plot`. `X.index` gives us the datetime index, and `X.values` gives us the pricing values. These are used as the X and Y coordinates to make a graph." 
535 | ] 536 | }, 537 | { 538 | "cell_type": "markdown", 539 | "metadata": {}, 540 | "source": [ 541 | "因为这里有日期信息在我们的数据中,我们将`X.index`得到的日期索引数据,和`X.values`得到的价格数据作为`.plot`X和Y轴数据用于画图。" 542 | ] 543 | }, 544 | { 545 | "cell_type": "code", 546 | "execution_count": null, 547 | "metadata": { 548 | "collapsed": true 549 | }, 550 | "outputs": [], 551 | "source": [ 552 | "plt.plot(X.index, X.values)\n", 553 | "plt.ylabel('Price')\n", 554 | "plt.legend(['IF888'])" 555 | ] 556 | }, 557 | { 558 | "cell_type": "markdown", 559 | "metadata": {}, 560 | "source": [ 561 | "我们在真实数据上来做点统计计算。" 562 | ] 563 | }, 564 | { 565 | "cell_type": "code", 566 | "execution_count": null, 567 | "metadata": { 568 | "collapsed": true 569 | }, 570 | "outputs": [], 571 | "source": [ 572 | "np.mean(X)" 573 | ] 574 | }, 575 | { 576 | "cell_type": "code", 577 | "execution_count": null, 578 | "metadata": { 579 | "collapsed": true 580 | }, 581 | "outputs": [], 582 | "source": [ 583 | "np.std(X)" 584 | ] 585 | }, 586 | { 587 | "cell_type": "markdown", 588 | "metadata": { 589 | "collapsed": true 590 | }, 591 | "source": [ 592 | "## Getting Returns from Prices\n", 593 | "\n", 594 | "We can use the `pct_change` function to get returns. Notice how we drop the first element after doing this, as it will be `NaN` (nothing -> something results in a NaN percent change)." 595 | ] 596 | }, 597 | { 598 | "cell_type": "markdown", 599 | "metadata": {}, 600 | "source": [ 601 | "## 根据价格数据计算回报\n", 602 | "我们用`pct_change`函数提取回报,注意:返回的第一个元素是`NaN`,所以我们会忽略掉他(因为第一个元素之前没有元素,无法比较,故返回一个NaN)" 603 | ] 604 | }, 605 | { 606 | "cell_type": "code", 607 | "execution_count": null, 608 | "metadata": { 609 | "collapsed": true 610 | }, 611 | "outputs": [], 612 | "source": [ 613 | "R = X.pct_change()[1:]" 614 | ] 615 | }, 616 | { 617 | "cell_type": "markdown", 618 | "metadata": {}, 619 | "source": [ 620 | "We can plot the returns distribution as a histogram." 
621 | ] 622 | }, 623 | { 624 | "cell_type": "markdown", 625 | "metadata": {}, 626 | "source": [ 627 | "我们可以用一个直方图去描述回报分布" 628 | ] 629 | }, 630 | { 631 | "cell_type": "code", 632 | "execution_count": null, 633 | "metadata": { 634 | "collapsed": true 635 | }, 636 | "outputs": [], 637 | "source": [ 638 | "plt.hist(R, bins=20)\n", 639 | "plt.xlabel('Returns')\n", 640 | "plt.ylabel('Frequency')\n", 641 | "plt.grid(True)\n", 642 | "plt.legend(['IF888 Returns']);" 643 | ] 644 | }, 645 | { 646 | "cell_type": "markdown", 647 | "metadata": {}, 648 | "source": [ 649 | "Get statistics again." 650 | ] 651 | }, 652 | { 653 | "cell_type": "markdown", 654 | "metadata": {}, 655 | "source": [ 656 | "再次计算统计量" 657 | ] 658 | }, 659 | { 660 | "cell_type": "code", 661 | "execution_count": null, 662 | "metadata": { 663 | "collapsed": true 664 | }, 665 | "outputs": [], 666 | "source": [ 667 | "np.mean(R) # the same as R.mean()" 668 | ] 669 | }, 670 | { 671 | "cell_type": "code", 672 | "execution_count": null, 673 | "metadata": { 674 | "collapsed": true 675 | }, 676 | "outputs": [], 677 | "source": [ 678 | "R.std() # sample std (ddof=1); np.std(R) defaults to ddof=0, so the two values differ slightly" 679 | ] 680 | }, 681 | { 682 | "cell_type": "markdown", 683 | "metadata": {}, 684 | "source": [ 685 | "Now let's go backwards and generate data out of a normal distribution using the statistics we estimated from IF888's returns. We'll see that we have good reason to suspect IF888's returns may not be normal, as the resulting normal distribution looks far different." 
686 | ] 687 | }, 688 | { 689 | "cell_type": "markdown", 690 | "metadata": {}, 691 | "source": [ 692 | "现在我们用回报数据的均值和标准差作为一个正态分布的参数去生成数据,我们有理由相信`IF888`的回报并不服从正态分布,因为对比后发现,两者相去甚远。" 693 | ] 694 | }, 695 | { 696 | "cell_type": "code", 697 | "execution_count": null, 698 | "metadata": { 699 | "collapsed": true 700 | }, 701 | "outputs": [], 702 | "source": [ 703 | "np.random.seed(3)\n", 704 | "\n", 705 | "plt.hist(np.random.normal(np.mean(R), np.std(R), 10000), bins=20)\n", 706 | "plt.xlabel('Returns')\n", 707 | "plt.ylabel('Frequency')\n", 708 | "plt.grid(True)\n", 709 | "plt.legend(['Normal Distribution Returns'], loc='best');" 710 | ] 711 | }, 712 | { 713 | "cell_type": "markdown", 714 | "metadata": {}, 715 | "source": [ 716 | "## Generating a Moving Average\n", 717 | "\n", 718 | "`pandas` has some nice tools to allow us to generate rolling statistics. Here's an example. Notice how there's no moving average for the first 59 days, as a 60-day window needs 60 observations before it can produce a value." 
719 | ] 720 | }, 721 | { 722 | "cell_type": "markdown", 723 | "metadata": {}, 724 | "source": [ 725 | "## 生成一条移动均线\n", 726 | "`pandas`有非常优秀的工具能让我们生成rolling statistics.这里有个例子.注意我们是没有足够的数据去生成前60日均线数据的。 " 727 | ] 728 | }, 729 | { 730 | "cell_type": "code", 731 | "execution_count": null, 732 | "metadata": { 733 | "collapsed": true 734 | }, 735 | "outputs": [], 736 | "source": [ 737 | "# Take the average of the last 60 days at each timepoint.\n", 738 | "MAVG = X.rolling(window=60).mean()\n", 739 | "plt.plot(X.index, X.values)\n", 740 | "plt.plot(MAVG.index, MAVG.values)\n", 741 | "plt.ylabel('Price')\n", 742 | "plt.legend(['IF888', '60-day MAVG']);" 743 | ] 744 | } 745 | ], 746 | "metadata": { 747 | "kernelspec": { 748 | "display_name": "Python 3 tensorflow", 749 | "language": "python", 750 | "name": "python3" 751 | }, 752 | "language_info": { 753 | "codemirror_mode": { 754 | "name": "ipython", 755 | "version": 3 756 | }, 757 | "file_extension": ".py", 758 | "mimetype": "text/x-python", 759 | "name": "python", 760 | "nbconvert_exporter": "python", 761 | "pygments_lexer": "ipython3", 762 | "version": "3.6.4" 763 | }, 764 | "toc": { 765 | "nav_menu": {}, 766 | "number_sections": true, 767 | "sideBar": true, 768 | "skip_h1_title": false, 769 | "title_cell": "Table of Contents", 770 | "title_sidebar": "Contents", 771 | "toc_cell": false, 772 | "toc_position": {}, 773 | "toc_section_display": true, 774 | "toc_window_display": false 775 | }, 776 | "varInspector": { 777 | "cols": { 778 | "lenName": 16, 779 | "lenType": 16, 780 | "lenVar": 40 781 | }, 782 | "kernels_config": { 783 | "python": { 784 | "delete_cmd_postfix": "", 785 | "delete_cmd_prefix": "del ", 786 | "library": "var_list.py", 787 | "varRefreshCmd": "print(var_dic_list())" 788 | }, 789 | "r": { 790 | "delete_cmd_postfix": ") ", 791 | "delete_cmd_prefix": "rm(", 792 | "library": "var_list.r", 793 | "varRefreshCmd": "cat(var_dic_list()) " 794 | } 795 | }, 796 | "types_to_exclude": [ 797 | "module", 798 | 
"function", 799 | "builtin_function_or_method", 800 | "instance", 801 | "_Feature" 802 | ], 803 | "window_display": false 804 | } 805 | }, 806 | "nbformat": 4, 807 | "nbformat_minor": 2 808 | } 809 | -------------------------------------------------------------------------------- /02.Introduction to Python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to Python\n", 8 | "by Maxwell Margenot\n", 9 | "\n", 10 | "Part of the Quantopian Lecture Series:\n", 11 | "\n", 12 | "* [www.quantopian.com/lectures](https://www.quantopian.com/lectures)\n", 13 | "* [github.com/quantopian/research_public](https://github.com/quantopian/research_public)\n", 14 | "\n", 15 | "Notebook released under the Creative Commons Attribution 4.0 License.\n", 16 | "\n", 17 | "---\n", 18 | "\n", 19 | "All of the coding that you will do on the Quantopian platform will be in Python. It is also just a good, jack-of-all-trades language to know! Here we will provide you with the basics so that you can feel confident going through our other lectures and understanding what is happening." 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "# Python入门\n", 27 | "by Maxwell Margenot\n", 28 | "\n", 29 | "Part of the Quantopian Lecture Series:\n", 30 | "\n", 31 | "* [www.quantopian.com/lectures](https://www.quantopian.com/lectures)\n", 32 | "* [github.com/quantopian/research_public](https://github.com/quantopian/research_public)\n", 33 | "\n", 34 | "Notebook released under the Creative Commons Attribution 4.0 License.\n", 35 | "\n", 36 | "---\n", 37 | "\n", 38 | "你将在Quantopain平台的python环境中执行所有的代码.Python是一门十分优秀、几乎万能的编程语言.这里将会传授一些基础知识,以便大家能在接下来的课程中更加有信心,也能明白课程中在讲什么." 
39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "## Code Comments\n", 46 | "\n", 47 | "A comment is a note made by a programmer in the source code of a program. Its purpose is to clarify the source code and make it easier for people to follow along with what is happening. Anything in a comment is generally ignored when the code is actually run, making comments useful for including explanations and reasoning as well as removing specific lines of code that you may be unsure about. Comments in Python are created by using the pound symbol (`# Insert Text Here`). Including a `#` in a line of code will comment out anything that follows it." 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "## 代码注释\n", 55 | "\n", 56 | "一个代码注释指的是程序员在源代码中插入的一个笔记.是对源代码的说明,让人更容易明白接下来会发生什么.当代码实际运行时,注释内容是被忽略的,做注释能让代码更具可读性和有条理,当不确定是否删除某些代码行时,注释会有帮助.行注释在Pyhon中以一个`#`开头,后面接你想要输入任何文字. (`# 在这里插入文本`)" 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": { 63 | "collapsed": true 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "# This is a comment\n", 68 | "# These lines of code will not change any values\n", 69 | "# Anything following the first # is not run as code" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 76 | "You may hear text enclosed in triple quotes (`\"\"\" Insert Text Here \"\"\"`) referred to as multi-line comments, but this is not entirely accurate. This is a special type of `string` (a data type we will cover), called a `docstring`, used to explain the purpose of a function." 77 | ] 78 | }, 79 | { 80 | "cell_type": "markdown", 81 | "metadata": {}, 82 | "source": [ 83 | "你可能听过用三对双引号`(\"\"\" 在这里插入文本 \"\"\")`来表示多行的注释,这个说法不完全准确,这是一种特殊的`string`(后面会讲到这种数据类型)数据类型,叫做`docstring`,用来解释函数的用法." 
84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "metadata": { 90 | "collapsed": true 91 | }, 92 | "outputs": [], 93 | "source": [ 94 | "\"\"\" This is a special string \"\"\"" 95 | ] 96 | }, 97 | { 98 | "cell_type": "markdown", 99 | "metadata": {}, 100 | "source": [ 101 | "Make sure you read the comments within each code cell (if they are there). They will provide more real-time explanations of what is going on as you look at each line of code." 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": {}, 107 | "source": [ 108 | "请确保阅读每个`code cell`中的注释(如果有的话).他们会提供代码实时的说明." 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "## Variables\n", 116 | "\n", 117 | "Variables provide names for values in programming. If you want to save a value for later or repeated use, you give the value a name, storing the contents in a variable. Variables in programming work in a fundamentally similar way to variables in algebra, but in Python they can take on various different data types.\n", 118 | "\n", 119 | "The basic variable types that we will cover in this section are `integers`, `floating point numbers`, `booleans`, and `strings`. \n", 120 | "\n", 121 | "An `integer` in programming is the same as in mathematics, a round number with no values after the decimal point. We use the built-in `print` function here to display the values of our variables as well as their types!" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "## 变量\n", 129 | "\n", 130 | "变量为编程中的值提供名字.如果你想把某个值保存下来以供稍后或重复使用,你可以给这个值起一个名字,并将其内容储存在变量中.程序中的变量和代数中的变量很相似,在Python中,他们可以是各种不同的数据类型.\n", 131 | "\n", 132 | "在接下来的章节我们会介绍一些基础的变量类型,包括`integers(整数)`,`float point number(浮点数)`, `booleans(布尔值)`, 和 `strings(字符串)`.\n", 133 | "\n", 134 | "编程中的`integer` 意义和数学里一样, 表示一个没有小数点的数.我们用内置函数`print`来展示变量的值及变量类型!" 
135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "metadata": { 141 | "collapsed": true 142 | }, 143 | "outputs": [], 144 | "source": [ 145 | "my_integer = 50\n", 146 | "print(my_integer, type(my_integer))" 147 | ] 148 | }, 149 | { 150 | "cell_type": "markdown", 151 | "metadata": {}, 152 | "source": [ 153 | "Variables, regardless of type, are assigned by using a single equals sign (`=`). Variables are case-sensitive so any changes in variation in the capitals of a variable name will reference a different variable entirely." 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "metadata": {}, 159 | "source": [ 160 | "变量,可以无视数据类型只用一个等号(`=`)来表示赋值.变量名对大小写敏感,变量名称的任何大小写的变化都将使他变成另外一个变量." 161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": null, 166 | "metadata": { 167 | "collapsed": true 168 | }, 169 | "outputs": [], 170 | "source": [ 171 | "one = 1\n", 172 | "print(one)" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "A `floating point` number, or a `float` is a fancy name for a real number (again as in mathematics). To define a `float`, we need to either include a decimal point or specify that the value is a float." 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "一个浮点数,其实是实数的\"花名\"(这里的含义和数学中一样).定义一个浮点数,只需要加上一个小数点,或者指定类型为浮点数." 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": null, 192 | "metadata": { 193 | "collapsed": true, 194 | "scrolled": true 195 | }, 196 | "outputs": [], 197 | "source": [ 198 | "my_float = 1.0\n", 199 | "print(my_float, type(my_float))\n", 200 | "my_float = float(1)\n", 201 | "print(my_float, type(my_float))" 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "A variable of type `float` will not round the number that you store in it, while a variable of type `integer` will. 
This makes `floats` more suitable for mathematical calculations where you want more than just integers.\n", 209 | "\n", 210 | "Note that as we used the `float()` function to force an number to be considered a `float`, we can use the `int()` function to force a number to be considered an `int`." 211 | ] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "`float`类型并不会对你存储的数字进行舍入,`integer`类型则会.相较于`int`类型,`float`类型更适合科学计算.\n", 218 | "\n", 219 | "注意:我们可以使用`float()` 函数强制将数字转成一个`float`,亦可用`int()`函数强制将数字转换成`int`." 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": null, 225 | "metadata": { 226 | "collapsed": true 227 | }, 228 | "outputs": [], 229 | "source": [ 230 | "my_int = int(3.14159)\n", 231 | "print(my_int, type(my_int))" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "metadata": {}, 237 | "source": [ 238 | "The `int()` function will also truncate any digits that a number may have after the decimal point!\n", 239 | "\n", 240 | "Strings allow you to include text as a variable to operate on. They are defined using either single quotes ('') or double quotes (\"\")." 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": {}, 246 | "source": [ 247 | "`int()`会将小数点后所有的内容舍弃掉!\n", 248 | "\n", 249 | "字符串数据类型允许你将一段文本作为变量进行操作.我们用单引号 ('') 或者双引号(\"\")来表示." 250 | ] 251 | }, 252 | { 253 | "cell_type": "code", 254 | "execution_count": null, 255 | "metadata": { 256 | "collapsed": true 257 | }, 258 | "outputs": [], 259 | "source": [ 260 | "my_string = 'This is a string with single quotes'\n", 261 | "print(my_string)\n", 262 | "my_string = \"This is a string with double quotes\"\n", 263 | "print(my_string)" 264 | ] 265 | }, 266 | { 267 | "cell_type": "markdown", 268 | "metadata": {}, 269 | "source": [ 270 | "Both are allowed so that we can include apostrophes or quotation marks in a string if we so choose." 
271 | ] 272 | }, 273 | { 274 | "cell_type": "markdown", 275 | "metadata": {}, 276 | "source": [ 277 | "两种方式都是允许的,这是为了能让我们灵活的在字符串中插入单引号或者双引号." 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "metadata": { 284 | "collapsed": true 285 | }, 286 | "outputs": [], 287 | "source": [ 288 | "my_string = '\"Jabberwocky\", by Lewis Carroll'\n", 289 | "print(my_string)\n", 290 | "my_string = \"'Twas brillig, and the slithy toves / Did gyre and gimble in the wabe;\"\n", 291 | "print(my_string)" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "Booleans, or `bools` are binary variable types. A `bool` can only take on one of two values, these being `True` or `False`. There is much more to this idea of truth values when it comes to programming, which we cover later in the [Logical Operators](#id-section5) of this notebook." 299 | ] 300 | }, 301 | { 302 | "cell_type": "markdown", 303 | "metadata": {}, 304 | "source": [ 305 | "布尔值是一种二进制变量类型.一个`bool`型变量的值只能是`True`或者`False`.在接下来的[Logical Operators](#id-section5)章节中会介绍更多关于真值的概念." 306 | ] 307 | }, 308 | { 309 | "cell_type": "code", 310 | "execution_count": null, 311 | "metadata": { 312 | "collapsed": true 313 | }, 314 | "outputs": [], 315 | "source": [ 316 | "my_bool = True\n", 317 | "print(my_bool, type(my_bool))" 318 | ] 319 | }, 320 | { 321 | "cell_type": "markdown", 322 | "metadata": {}, 323 | "source": [ 324 | "There are many more data types that you can assign as variables in Python, but these are the basic ones! We will cover a few more later as we move through this tutorial." 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "在Python中有非常多的数据类型可供使用,这里的只是一些基础!随着课程的进行,我们将逐渐接触其他的数据类型." 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": {}, 337 | "source": [ 338 | "## Basic Math\n", 339 | "\n", 340 | "Python has a number of built-in math functions. 
These can be extended even further by importing the **math** package or by including any number of other calculation-based packages.\n", 341 | "\n", 342 | "All of the basic arithmetic operations are supported: `+`, `-`, `/`, and `*`. You can create exponents by using `**` and modular arithmetic is introduced with the mod operator, `%`." 343 | ] 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "metadata": {}, 348 | "source": [ 349 | "## 数学基础\n", 350 | "Python有内置的数值计算函数,通过导入`math`或者其他的计算包来丰富你的数值计算函数.\n", 351 | "\n", 352 | "所有基础的算术操作包括`+`, `-`, `/`, 和 `*`.你可以用`**`表示幂运算,`%`表示模运算." 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": null, 358 | "metadata": { 359 | "collapsed": true 360 | }, 361 | "outputs": [], 362 | "source": [ 363 | "print('Addition: ', 2 + 2)\n", 364 | "print('Subtraction: ', 7 - 4)\n", 365 | "print('Multiplication: ', 2 * 5)\n", 366 | "print('Division: ', 10 / 2)\n", 367 | "print('Exponentiation: ', 3**2)" 368 | ] 369 | }, 370 | { 371 | "cell_type": "markdown", 372 | "metadata": {}, 373 | "source": [ 374 | "If you are not familiar with the mod operator, it operates like a remainder function. If we type $15 \\ \\% \\ 4$, it will return the remainder after dividing $15$ by $4$." 375 | ] 376 | }, 377 | { 378 | "cell_type": "markdown", 379 | "metadata": {}, 380 | "source": [ 381 | "如果你对模运算不熟悉, 可以把他想象成一个求余数的函数. 当我们运行 $15 \\ \\% \\ 4$, 他会返回 $15$ 除以 $4$ 的余数." 382 | ] 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": null, 387 | "metadata": { 388 | "collapsed": true 389 | }, 390 | "outputs": [], 391 | "source": [ 392 | "print('Modulo: ', 15 % 4)" 393 | ] 394 | }, 395 | { 396 | "cell_type": "markdown", 397 | "metadata": {}, 398 | "source": [ 399 | "Mathematical functions also work on variables!" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": {}, 405 | "source": [ 406 | "数学函数同样能作用在变量上!" 
407 | ] 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": null, 412 | "metadata": { 413 | "collapsed": true, 414 | "scrolled": true 415 | }, 416 | "outputs": [], 417 | "source": [ 418 | "first_integer = 4\n", 419 | "second_integer = 5\n", 420 | "print(first_integer * second_integer)" 421 | ] 422 | }, 423 | { 424 | "cell_type": "markdown", 425 | "metadata": {}, 426 | "source": [ 427 | "Make sure that your variables are floats if you want to have decimal points in your answer. If you perform math exclusively with integers, you get an integer. Including any float in the calculation will make the result a float." 428 | ] 429 | }, 430 | { 431 | "cell_type": "markdown", 432 | "metadata": {}, 433 | "source": [ 434 | "- 如果你想要答案有小数点,请确保你的变量类型是浮点型.\n", 435 | "- 如果是用整数执行计算,得到的则会是整数.(**这个法则仅仅在python 2中生效!!**)\n", 436 | "- 如果计算中有一个是浮点数,那么结果也会是浮点数." 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": null, 442 | "metadata": { 443 | "collapsed": true, 444 | "scrolled": true 445 | }, 446 | "outputs": [], 447 | "source": [ 448 | "first_integer = 11\n", 449 | "second_integer = 3\n", 450 | "\n", 451 | "# 当你的python的版本是2.\n", 452 | "# print first_integer / second_integer\n", 453 | "print(first_integer // second_integer)" 454 | ] 455 | }, 456 | { 457 | "cell_type": "code", 458 | "execution_count": null, 459 | "metadata": { 460 | "collapsed": true 461 | }, 462 | "outputs": [], 463 | "source": [ 464 | "first_number = 11.0\n", 465 | "second_number = 3.0\n", 466 | "print(first_number / second_number)" 467 | ] 468 | }, 469 | { 470 | "cell_type": "markdown", 471 | "metadata": {}, 472 | "source": [ 473 | "Python has a few built-in math functions. The most notable of these are:\n", 474 | "\n", 475 | "* `abs()`\n", 476 | "* `round()`\n", 477 | "* `max()`\n", 478 | "* `min()`\n", 479 | "* `sum()`\n", 480 | "\n", 481 | "These functions all act as you would expect, given their names. Calling `abs()` on a number will return its absolute value. 
The `round()` function will round a number to a specified number of decimal places (the default is $0$). Calling `max()` or `min()` on a collection of numbers will return, respectively, the maximum or minimum value in the collection. Calling `sum()` on a collection of numbers will add them all up. If you're not familiar with how collections of values in Python work, don't worry! We will cover collections in depth in the next section. \n", 482 | "\n", 483 | "Additional math functionality can be added in with the `math` package." 484 | ] 485 | }, 486 | { 487 | "cell_type": "markdown", 488 | "metadata": {}, 489 | "source": [ 490 | "Python有一些内置的数学函数,其中最为常见的:\n", 491 | "\n", 492 | "* `abs()`\n", 493 | "* `round()`\n", 494 | "* `max()`\n", 495 | "* `min()`\n", 496 | "* `sum()`\n", 497 | "\n", 498 | "这些函数的作用就如同其字面意思. 调用 `abs()` 会返回一个数字的绝对值. `round()`函数会将数字舍入到指定的小数位数 (默认值为 0). 对一个数字集合调用 `max()` 或者 `min()`将会返回集合中的最大值或者最小值. 对于一个数字集合调用 `sum()` 将返回其总和.如果你不太清楚Python中的集合工作原理,不用担心,我们将会在下一节深入探讨.\n", 499 | "\n", 500 | "使用`math`包来导入更多的数学函数." 501 | ] 502 | }, 503 | { 504 | "cell_type": "code", 505 | "execution_count": null, 506 | "metadata": { 507 | "collapsed": true 508 | }, 509 | "outputs": [], 510 | "source": [ 511 | "import math" 512 | ] 513 | }, 514 | { 515 | "cell_type": "markdown", 516 | "metadata": {}, 517 | "source": [ 518 | "The math library adds a long list of new mathematical functions to Python. Feel free to check out the [documentation](https://docs.python.org/2/library/math.html) for the full list and details. It includes some mathematical constants:" 519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": {}, 524 | "source": [ 525 | "`math`库添加了很多新的数学函数,通过查看[documentation](https://docs.python.org/2/library/math.html),获取完整列表和详细信息,其中包含了数学常数." 
526 | ] 527 | }, 528 | { 529 | "cell_type": "code", 530 | "execution_count": null, 531 | "metadata": { 532 | "collapsed": true 533 | }, 534 | "outputs": [], 535 | "source": [ 536 | "print('Pi: ', math.pi)\n", 537 | "print(\"Euler's number: \", math.e)" 538 | ] 539 | }, 540 | { 541 | "cell_type": "markdown", 542 | "metadata": {}, 543 | "source": [ 544 | "As well as some commonly used math functions:" 545 | ] 546 | }, 547 | { 548 | "cell_type": "markdown", 549 | "metadata": {}, 550 | "source": [ 551 | "以及一些常用的数学函数" 552 | ] 553 | }, 554 | { 555 | "cell_type": "code", 556 | "execution_count": null, 557 | "metadata": { 558 | "collapsed": true 559 | }, 560 | "outputs": [], 561 | "source": [ 562 | "print('Cosine of pi: ', math.cos(math.pi))" 563 | ] 564 | }, 565 | { 566 | "cell_type": "markdown", 567 | "metadata": {}, 568 | "source": [ 569 | "## Collections\n", 570 | "### Lists\n", 571 | "\n", 572 | "A `list` in Python is an ordered collection of objects that can contain any data type. We define a `list` using brackets (`[]`)." 573 | ] 574 | }, 575 | { 576 | "cell_type": "markdown", 577 | "metadata": {}, 578 | "source": [ 579 | "## 集合\n", 580 | "### List (列表)\n", 581 | "\n", 582 | "Python中的`list`是可以包含任何数据类型的对象的有序集合.\n", 583 | "我们使用方括号(`[]`)定义一个列表." 584 | ] 585 | }, 586 | { 587 | "cell_type": "code", 588 | "execution_count": null, 589 | "metadata": { 590 | "collapsed": true 591 | }, 592 | "outputs": [], 593 | "source": [ 594 | "my_list = [1, 2, 3]\n", 595 | "print(my_list)" 596 | ] 597 | }, 598 | { 599 | "cell_type": "markdown", 600 | "metadata": {}, 601 | "source": [ 602 | "We can access and index the list by using brackets as well. In order to select an individual element, simply type the list name followed by the index of the item you are looking for in brackets." 603 | ] 604 | }, 605 | { 606 | "cell_type": "markdown", 607 | "metadata": {}, 608 | "source": [ 609 | "我们可以通过方括号来访问和索引列表,为了选择一个单独的元素,只需输入列表名称,然后在方括号中输入你要查找的元素的索引." 
610 | ] 611 | }, 612 | { 613 | "cell_type": "code", 614 | "execution_count": null, 615 | "metadata": { 616 | "collapsed": true 617 | }, 618 | "outputs": [], 619 | "source": [ 620 | "print(my_list[0])\n", 621 | "print(my_list[2])" 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": {}, 627 | "source": [ 628 | "Indexing in Python starts from $0$. If you have a list of length $n$, the first element of the list is at index $0$, the second element is at index $1$, and so on and so forth. The final element of the list will be at index $n-1$. Be careful! Trying to access a non-existent index will cause an error." 629 | ] 630 | }, 631 | { 632 | "cell_type": "markdown", 633 | "metadata": {}, 634 | "source": [ 635 | "Python的索引起始值是$0$.如果你列表的长度是$n$, 第一个元素的索引号是$0$, 第二个元素的索引号是$1$,以此类推. 最后一个元素的索引是$n-1$. 一定要小心! 当你尝试访问一个不存在的索引,就会报错." 636 | ] 637 | }, 638 | { 639 | "cell_type": "code", 640 | "execution_count": null, 641 | "metadata": { 642 | "collapsed": true 643 | }, 644 | "outputs": [], 645 | "source": [ 646 | "print('The first, second, and third list elements: ', my_list[0], my_list[1], my_list[2])\n", 647 | "print('Accessing outside the list bounds causes an error: ', my_list[3])" 648 | ] 649 | }, 650 | { 651 | "cell_type": "markdown", 652 | "metadata": {}, 653 | "source": [ 654 | "We can see the number of elements in a list by calling the `len()` function." 655 | ] 656 | }, 657 | { 658 | "cell_type": "markdown", 659 | "metadata": {}, 660 | "source": [ 661 | "我们可以调用`len()`函数来获取list的元素个数(列表长度)." 662 | ] 663 | }, 664 | { 665 | "cell_type": "code", 666 | "execution_count": null, 667 | "metadata": { 668 | "collapsed": true 669 | }, 670 | "outputs": [], 671 | "source": [ 672 | "print(len(my_list))" 673 | ] 674 | }, 675 | { 676 | "cell_type": "markdown", 677 | "metadata": {}, 678 | "source": [ 679 | "We can update and change a list by accessing an index and assigning new value." 
680 | ] 681 | }, 682 | { 683 | "cell_type": "markdown", 684 | "metadata": {}, 685 | "source": [ 686 | "我们可以通过访问索引并赋新值来更新和改变列表." 687 | ] 688 | }, 689 | { 690 | "cell_type": "code", 691 | "execution_count": null, 692 | "metadata": { 693 | "collapsed": true 694 | }, 695 | "outputs": [], 696 | "source": [ 697 | "print(my_list)\n", 698 | "my_list[0] = 42\n", 699 | "print(my_list)" 700 | ] 701 | }, 702 | { 703 | "cell_type": "markdown", 704 | "metadata": {}, 705 | "source": [ 706 | "This is fundamentally different from how strings are handled. A `list` is mutable, meaning that you can change a `list`'s elements without changing the list itself. Some data types, like `strings`, are immutable, meaning you cannot change them at all. Once a `string` or other immutable data type has been created, it cannot be directly modified without creating an entirely new object." 707 | ] 708 | }, 709 | { 710 | "cell_type": "markdown", 711 | "metadata": {}, 712 | "source": [ 713 | "列表和字符串的处理有根本的区别.\n", 714 | "`list`是可变的数据类型,这意味着你可以更改`list`的元素.\n", 715 | "一些数据类型(如 `strings`)是不可变的,你无法去修改他们.\n", 716 | "一旦创建了一个`string`或其他不可变数据类型,就不能直接修改它,除非创建一个全新的对象." 717 | ] 718 | }, 719 | { 720 | "cell_type": "code", 721 | "execution_count": null, 722 | "metadata": { 723 | "collapsed": true 724 | }, 725 | "outputs": [], 726 | "source": [ 727 | "my_string = \"Strings never change\"\n", 728 | "my_string[0] = 'Z'" 729 | ] 730 | }, 731 | { 732 | "cell_type": "markdown", 733 | "metadata": {}, 734 | "source": [ 735 | "As we stated before, a list can contain any data type. Thus, lists can also contain strings." 736 | ] 737 | }, 738 | { 739 | "cell_type": "markdown", 740 | "metadata": {}, 741 | "source": [ 742 | "正如前面所说,列表的元素可以是任何数据类型.因此,字符串也可以作为元素." 
743 | ] 744 | }, 745 | { 746 | "cell_type": "code", 747 | "execution_count": null, 748 | "metadata": { 749 | "collapsed": true 750 | }, 751 | "outputs": [], 752 | "source": [ 753 | "my_list_2 = ['one', 'two', 'three']\n", 754 | "print(my_list_2)" 755 | ] 756 | }, 757 | { 758 | "cell_type": "markdown", 759 | "metadata": {}, 760 | "source": [ 761 | "Lists can also contain multiple different data types at once!" 762 | ] 763 | }, 764 | { 765 | "cell_type": "markdown", 766 | "metadata": {}, 767 | "source": [ 768 | "列表也能同时容纳不同数据类型的数据." 769 | ] 770 | }, 771 | { 772 | "cell_type": "code", 773 | "execution_count": null, 774 | "metadata": { 775 | "collapsed": true 776 | }, 777 | "outputs": [], 778 | "source": [ 779 | "my_list_3 = [True, 'False', 42]" 780 | ] 781 | }, 782 | { 783 | "cell_type": "markdown", 784 | "metadata": {}, 785 | "source": [ 786 | "If you want to put two lists together, they can be combined with a `+` symbol." 787 | ] 788 | }, 789 | { 790 | "cell_type": "markdown", 791 | "metadata": {}, 792 | "source": [ 793 | "你可以用一个`+`来连接两个列表." 794 | ] 795 | }, 796 | { 797 | "cell_type": "code", 798 | "execution_count": null, 799 | "metadata": { 800 | "collapsed": true 801 | }, 802 | "outputs": [], 803 | "source": [ 804 | "my_list_4 = my_list + my_list_2 + my_list_3\n", 805 | "print(my_list_4)" 806 | ] 807 | }, 808 | { 809 | "cell_type": "markdown", 810 | "metadata": {}, 811 | "source": [ 812 | "In addition to accessing individual elements of a list, we can access groups of elements through slicing." 813 | ] 814 | }, 815 | { 816 | "cell_type": "markdown", 817 | "metadata": {}, 818 | "source": [ 819 | "除了可以访问列表中单个元素之外,还可以通过切片(slice)的方式访问多个元素." 
820 | ] 821 | }, 822 | { 823 | "cell_type": "code", 824 | "execution_count": null, 825 | "metadata": { 826 | "collapsed": true 827 | }, 828 | "outputs": [], 829 | "source": [ 830 | "my_list = ['friends', 'romans', 'countrymen', 'lend', 'me', 'your', 'ears']" 831 | ] 832 | }, 833 | { 834 | "cell_type": "markdown", 835 | "metadata": {}, 836 | "source": [ 837 | "#### Slicing\n", 838 | "\n", 839 | "We use the colon (`:`) to slice lists. " 840 | ] 841 | }, 842 | { 843 | "cell_type": "markdown", 844 | "metadata": {}, 845 | "source": [ 846 | "#### 切片\n", 847 | "\n", 848 | "我们使用冒号 (`:`)对列表进行切片. " 849 | ] 850 | }, 851 | { 852 | "cell_type": "code", 853 | "execution_count": null, 854 | "metadata": { 855 | "collapsed": true, 856 | "scrolled": true 857 | }, 858 | "outputs": [], 859 | "source": [ 860 | "print(my_list[2:4])" 861 | ] 862 | }, 863 | { 864 | "cell_type": "markdown", 865 | "metadata": {}, 866 | "source": [ 867 | "Using `:` we can select a group of elements in the list starting from the first element indicated and going up to (but not including) the last element indicated.\n", 868 | "\n", 869 | "We can also select everything from a certain point onward:" 870 | ] 871 | }, 872 | { 873 | "cell_type": "markdown", 874 | "metadata": {}, 875 | "source": [ 876 | "`:`左边是切片的起点索引,右边是切片终点索引,顾名思义,就是选取起点和终点之间的值,包含起点,但不包含终点.\n", 877 | "类似数学中的半开区间:[start, end)\n", 878 | "\n", 879 | "如果省略`:`左边的索引,则代表从第0个元素开始选取,类似的,省略右边的索引,则选取到列表最后一个元素."
880 | ] 881 | }, 882 | { 883 | "cell_type": "code", 884 | "execution_count": null, 885 | "metadata": { 886 | "collapsed": true, 887 | "scrolled": true 888 | }, 889 | "outputs": [], 890 | "source": [ 891 | "print(my_list[1:])" 892 | ] 893 | }, 894 | { 895 | "cell_type": "markdown", 896 | "metadata": {}, 897 | "source": [ 898 | "And everything before a certain point" 899 | ] 900 | }, 901 | { 902 | "cell_type": "markdown", 903 | "metadata": {}, 904 | "source": [ 905 | "某个元素之前所有的点" 906 | ] 907 | }, 908 | { 909 | "cell_type": "code", 910 | "execution_count": null, 911 | "metadata": { 912 | "collapsed": true, 913 | "scrolled": true 914 | }, 915 | "outputs": [], 916 | "source": [ 917 | "print(my_list[:4])" 918 | ] 919 | }, 920 | { 921 | "cell_type": "markdown", 922 | "metadata": {}, 923 | "source": [ 924 | "Using negative numbers will count from the end of the indices instead of from the beginning. For example, an index of `-1` indicates the last element of the list." 925 | ] 926 | }, 927 | { 928 | "cell_type": "markdown", 929 | "metadata": {}, 930 | "source": [ 931 | "使用负数索引将会从末尾倒序计数.例如,索引为`-1`表示列表最后一个元素." 932 | ] 933 | }, 934 | { 935 | "cell_type": "code", 936 | "execution_count": null, 937 | "metadata": { 938 | "collapsed": true 939 | }, 940 | "outputs": [], 941 | "source": [ 942 | "print(my_list[-1])" 943 | ] 944 | }, 945 | { 946 | "cell_type": "markdown", 947 | "metadata": {}, 948 | "source": [ 949 | "You can also add a third component to slicing. Instead of simply indicating the first and final parts of your slice, you can specify the step size that you want to take. So instead of taking every single element, you can take every other element." 
950 | ] 951 | }, 952 | { 953 | "cell_type": "markdown", 954 | "metadata": {}, 955 | "source": [ 956 | "假设一个列表为a,a切片的一般形式为:\n", 957 | "`a[start : end : step]` \n", 958 | " start是切片起点索引,end是切片终点索引,但切片结果不包括终点索引的值。step是步长默认是1。" 959 | ] 960 | }, 961 | { 962 | "cell_type": "code", 963 | "execution_count": null, 964 | "metadata": { 965 | "collapsed": true 966 | }, 967 | "outputs": [], 968 | "source": [ 969 | "print(my_list[0:7:2])" 970 | ] 971 | }, 972 | { 973 | "cell_type": "markdown", 974 | "metadata": {}, 975 | "source": [ 976 | "Here we have selected the entire list (because `0:7` will yield elements `0` through `6`) and we have selected a step size of `2`. So this returns element `0`, element `2`, element `4`, and so on, up to the end of the selected range. We can also omit the beginning and end of the slice and give only the step, if we like." 977 | ] 978 | }, 979 | { 980 | "cell_type": "markdown", 981 | "metadata": {}, 982 | "source": [ 983 | "这里我们选择了整个列表(因为0:7包含了0到6所有的元素),并且我们指定单步长为2.所以这将通过列表元素返回元素0,元素2,元素4等等.如果我们愿意,我们可以忽略开始位置和结束位置,仅指示步长." 984 | ] 985 | }, 986 | { 987 | "cell_type": "code", 988 | "execution_count": null, 989 | "metadata": { 990 | "collapsed": true 991 | }, 992 | "outputs": [], 993 | "source": [ 994 | "print(my_list[::2])" 995 | ] 996 | }, 997 | { 998 | "cell_type": "markdown", 999 | "metadata": {}, 1000 | "source": [ 1001 | "Slices implicitly use the beginning and end of the list when these are not otherwise specified." 1002 | ] 1003 | }, 1004 | { 1005 | "cell_type": "markdown", 1006 | "metadata": {}, 1007 | "source": [ 1008 | "没有特别说明的话,列表的切片是默认指定开头和结束位置的." 1009 | ] 1010 | }, 1011 | { 1012 | "cell_type": "code", 1013 | "execution_count": null, 1014 | "metadata": { 1015 | "collapsed": true 1016 | }, 1017 | "outputs": [], 1018 | "source": [ 1019 | "print(my_list[:])" 1020 | ] 1021 | }, 1022 | { 1023 | "cell_type": "markdown", 1024 | "metadata": {}, 1025 | "source": [ 1026 | "With a negative step size we can even reverse the list!"
1027 | ] 1028 | }, 1029 | { 1030 | "cell_type": "markdown", 1031 | "metadata": {}, 1032 | "source": [ 1033 | "使用一个负数作为步长,我们可以对列表进行逆向排序!" 1034 | ] 1035 | }, 1036 | { 1037 | "cell_type": "code", 1038 | "execution_count": null, 1039 | "metadata": { 1040 | "collapsed": true 1041 | }, 1042 | "outputs": [], 1043 | "source": [ 1044 | "print(my_list[::-1])" 1045 | ] 1046 | }, 1047 | { 1048 | "cell_type": "markdown", 1049 | "metadata": {}, 1050 | "source": [ 1051 | "Python does not have native matrices, but with lists we can produce a working facsimile. Other packages, such as `numpy`, add matrices as a separate data type, but in base Python the best way to create a matrix is to use a list of lists." 1052 | ] 1053 | }, 1054 | { 1055 | "cell_type": "markdown", 1056 | "metadata": {}, 1057 | "source": [ 1058 | "Python没有原生的矩阵数据类型,但是我们使用列表可以达到类似的效果.另外一些包,比如`numpy`,把矩阵添加为单独的数据类型,但是在Python基础应用中,使用列表嵌套的方式创建矩阵是一个不错的选择.\n", 1059 | "\n", 1060 | "- 译者の疑问:facsimile该如何翻译? " 1061 | ] 1062 | }, 1063 | { 1064 | "cell_type": "markdown", 1065 | "metadata": {}, 1066 | "source": [ 1067 | "We can also use built-in functions to generate sequences of values. In particular we will look at `range()` (because we will be using it later!). `range()` accepts several different combinations of arguments. In Python 3 it returns a `range` object rather than a `list`; wrap it in `list()` if you need an actual list." 1068 | ] 1069 | }, 1070 | { 1071 | "cell_type": "markdown", 1072 | "metadata": {}, 1073 | "source": [ 1074 | "我们也能通过使用内置函数来生成列表.我们会着重来看一下`range()`.`range`能采取几种不同的参数输入方式,并且返回一个列表(注意其返回的数据类型并不是`list`)." 1075 | ] 1076 | }, 1077 | { 1078 | "cell_type": "code", 1079 | "execution_count": null, 1080 | "metadata": { 1081 | "collapsed": true 1082 | }, 1083 | "outputs": [], 1084 | "source": [ 1085 | "b = 10\n", 1086 | "my_list = range(b)\n", 1087 | "print(my_list)\n", 1088 | "print(type(my_list))" 1089 | ] 1090 | }, 1091 | { 1092 | "cell_type": "markdown", 1093 | "metadata": {}, 1094 | "source": [ 1095 | "Similar to our list-slicing methods from before, we can define both a start and an end for our range. 
This will return a sequence that includes the start and excludes the end, just like a slice." 1096 | ] 1097 | }, 1098 | { 1099 | "cell_type": "markdown", 1100 | "metadata": {}, 1101 | "source": [ 1102 | "类似列表切片方法,对于`range`我们也可以指定开始和结束位置,返回一个列表,并且包含开头但不包含结尾,这一点也和切片的原理类似." 1103 | ] 1104 | }, 1105 | { 1106 | "cell_type": "code", 1107 | "execution_count": null, 1108 | "metadata": { 1109 | "collapsed": true 1110 | }, 1111 | "outputs": [], 1112 | "source": [ 1113 | "a = 0\n", 1114 | "b = 10\n", 1115 | "my_list = range(a, b)\n", 1116 | "print(my_list)" 1117 | ] 1118 | }, 1119 | { 1120 | "cell_type": "markdown", 1121 | "metadata": {}, 1122 | "source": [ 1123 | "We can also specify a step size. This again has the same behavior as a slice." 1124 | ] 1125 | }, 1126 | { 1127 | "cell_type": "markdown", 1128 | "metadata": {}, 1129 | "source": [ 1130 | "类似切片,我们一样可以设定一个步长." 1131 | ] 1132 | }, 1133 | { 1134 | "cell_type": "code", 1135 | "execution_count": null, 1136 | "metadata": { 1137 | "collapsed": true, 1138 | "scrolled": true 1139 | }, 1140 | "outputs": [], 1141 | "source": [ 1142 | "a = 0\n", 1143 | "b = 10\n", 1144 | "step = 2\n", 1145 | "my_list = range(a, b, step)\n", 1146 | "print(my_list)" 1147 | ] 1148 | }, 1149 | { 1150 | "cell_type": "markdown", 1151 | "metadata": {}, 1152 | "source": [ 1153 | "### Tuples\n", 1154 | "\n", 1155 | "A `tuple` is a data type similar to a list in that it can hold different kinds of data types. The key difference here is that a `tuple` is immutable. We define a `tuple` by separating the elements we want to include by commas. It is conventional to surround a `tuple` with parentheses."
1156 | ] 1157 | }, 1158 | { 1159 | "cell_type": "markdown", 1160 | "metadata": {}, 1161 | "source": [ 1162 | "### Tuples(元组)\n", 1163 | "`tuple`是类似于列表的数据类型,它可以容纳不同类型的数据类型.\n", 1164 | "这里最关键不同点是`tuple`是不可变的.\n", 1165 | "我们通过用逗号分隔我们要包括的元素来定义一个`tuple`.\n", 1166 | "通常用圆括号来表示一个`tuple`.\n", 1167 | "\n", 1168 | "注意:当你定义一个只有一个元素的`tuple`时,需要带上`','`,请感受下`(1)`和`(1,)`,计算机会把哪一个当做`tuple`?" 1169 | ] 1170 | }, 1171 | { 1172 | "cell_type": "code", 1173 | "execution_count": null, 1174 | "metadata": { 1175 | "collapsed": true 1176 | }, 1177 | "outputs": [], 1178 | "source": [ 1179 | "my_tuple = 'I', 'have', 30, 'cats'\n", 1180 | "print(my_tuple)" 1181 | ] 1182 | }, 1183 | { 1184 | "cell_type": "markdown", 1185 | "metadata": {}, 1186 | "source": [ 1187 | "As mentioned before, tuples are immutable. You can't change any part of them without defining a new tuple." 1188 | ] 1189 | }, 1190 | { 1191 | "cell_type": "markdown", 1192 | "metadata": {}, 1193 | "source": [ 1194 | "如前所述,元组是不可变的.\n", 1195 | "在不定义新元组的情况下,你不能修改他的元素." 1196 | ] 1197 | }, 1198 | { 1199 | "cell_type": "code", 1200 | "execution_count": null, 1201 | "metadata": { 1202 | "collapsed": true 1203 | }, 1204 | "outputs": [], 1205 | "source": [ 1206 | "# 强行将喵星人换成doge,是会遭天谴的!!!不信抬头看,苍天饶过谁!!!\n", 1207 | "my_tuple[3] = 'dogs' # Attempts to change the 'cats' value stored in the tuple to 'dogs'; raises a TypeError" 1208 | ] 1209 | }, 1210 | { 1211 | "cell_type": "markdown", 1212 | "metadata": {}, 1213 | "source": [ 1214 | "You can slice tuples the same way that you slice lists!" 1215 | ] 1216 | }, 1217 | { 1218 | "cell_type": "markdown", 1219 | "metadata": {}, 1220 | "source": [ 1221 | "同样的你也可以像list那样使用切片!"
1222 | ] 1223 | }, 1224 | { 1225 | "cell_type": "code", 1226 | "execution_count": null, 1227 | "metadata": { 1228 | "collapsed": true 1229 | }, 1230 | "outputs": [], 1231 | "source": [ 1232 | "print(my_tuple[1:3])" 1233 | ] 1234 | }, 1235 | { 1236 | "cell_type": "markdown", 1237 | "metadata": {}, 1238 | "source": [ 1239 | "And concatenate them the way that you would with strings!" 1240 | ] 1241 | }, 1242 | { 1243 | "cell_type": "markdown", 1244 | "metadata": {}, 1245 | "source": [ 1246 | "并且能像`strings`那样把他们连接起来!" 1247 | ] 1248 | }, 1249 | { 1250 | "cell_type": "code", 1251 | "execution_count": null, 1252 | "metadata": { 1253 | "collapsed": true 1254 | }, 1255 | "outputs": [], 1256 | "source": [ 1257 | "my_other_tuple = ('make', 'that', 50)\n", 1258 | "print(my_tuple + my_other_tuple)" 1259 | ] 1260 | }, 1261 | { 1262 | "cell_type": "markdown", 1263 | "metadata": {}, 1264 | "source": [ 1265 | "We can 'pack' values together, creating a tuple (as above), or we can 'unpack' values from a tuple, taking them out." 1266 | ] 1267 | }, 1268 | { 1269 | "cell_type": "markdown", 1270 | "metadata": {}, 1271 | "source": [ 1272 | "我们可以将一些值'压缩'在一起组成一个元组,我们也能通过'解压缩'的方式,把值从元组里取出来." 1273 | ] 1274 | }, 1275 | { 1276 | "cell_type": "code", 1277 | "execution_count": null, 1278 | "metadata": { 1279 | "collapsed": true 1280 | }, 1281 | "outputs": [], 1282 | "source": [ 1283 | "str_1, str_2, int_1 = my_other_tuple\n", 1284 | "print(str_1, str_2, int_1)" 1285 | ] 1286 | }, 1287 | { 1288 | "cell_type": "markdown", 1289 | "metadata": {}, 1290 | "source": [ 1291 | "Unpacking assigns each value of the tuple in order to each variable on the left hand side of the equals sign. Some functions, including user-defined functions, may return tuples, so we can use this to directly unpack them and access the values that we want." 
1292 | ] 1293 | }, 1294 | { 1295 | "cell_type": "markdown", 1296 | "metadata": {}, 1297 | "source": [ 1298 | "'解压缩'赋值是将元组里的元素按照顺序依次传递给等号左边的变量.有一些函数,包括用户自定义的函数,都可以返回元组,因此我们可以用这种'解压缩'的方式来获取我们想要的值." 1299 | ] 1300 | }, 1301 | { 1302 | "cell_type": "markdown", 1303 | "metadata": {}, 1304 | "source": [ 1305 | "### Sets\n", 1306 | "\n", 1307 | "A `set` is a collection of unordered, unique elements. It works almost exactly as you would expect a normal set of things in mathematics to work and is defined using braces (`{}`)." 1308 | ] 1309 | }, 1310 | { 1311 | "cell_type": "markdown", 1312 | "metadata": {}, 1313 | "source": [ 1314 | "### Sets(集合)\n", 1315 | "\n", 1316 | "`set`是无序且元素唯一的集合.\n", 1317 | "它几乎完全和数学中定义一样,并使用花括号(`{}`)来表示." 1318 | ] 1319 | }, 1320 | { 1321 | "cell_type": "code", 1322 | "execution_count": null, 1323 | "metadata": { 1324 | "collapsed": true 1325 | }, 1326 | "outputs": [], 1327 | "source": [ 1328 | "things_i_like = {'dogs', 7, 'the number 4', 4, 4, 4, 42, 'lizards', 'man I just LOVE the number 4'}\n", 1329 | "print(things_i_like, type(things_i_like))" 1330 | ] 1331 | }, 1332 | { 1333 | "cell_type": "markdown", 1334 | "metadata": {}, 1335 | "source": [ 1336 | "Note how any extra instances of the same item are removed in the final set. We can also create a `set` from a list, using the `set()` function." 1337 | ] 1338 | }, 1339 | { 1340 | "cell_type": "markdown", 1341 | "metadata": {}, 1342 | "source": [ 1343 | "我们可以使用`set()`函数,来剔除掉list中的重复元素并且返回一个`set`数据类型." 
1344 | ] 1345 | }, 1346 | { 1347 | "cell_type": "code", 1348 | "execution_count": null, 1349 | "metadata": { 1350 | "collapsed": true 1351 | }, 1352 | "outputs": [], 1353 | "source": [ 1354 | "animal_list = ['cats', 'dogs', 'dogs', 'dogs', 'lizards', 'sponges', 'cows', 'bats', 'sponges']\n", 1355 | "animal_set = set(animal_list)\n", 1356 | "print(animal_set) # Removes all extra instances from the list 从list中删除重复的元素" 1357 | ] 1358 | }, 1359 | { 1360 | "cell_type": "markdown", 1361 | "metadata": {}, 1362 | "source": [ 1363 | "Calling `len()` on a set will tell you how many elements are in it." 1364 | ] 1365 | }, 1366 | { 1367 | "cell_type": "markdown", 1368 | "metadata": {}, 1369 | "source": [ 1370 | "同样的使用`len()`函数可以告诉我们有多少个元素在set中." 1371 | ] 1372 | }, 1373 | { 1374 | "cell_type": "code", 1375 | "execution_count": null, 1376 | "metadata": { 1377 | "collapsed": true 1378 | }, 1379 | "outputs": [], 1380 | "source": [ 1381 | "print(len(animal_set))" 1382 | ] 1383 | }, 1384 | { 1385 | "cell_type": "markdown", 1386 | "metadata": {}, 1387 | "source": [ 1388 | "Because a `set` is unordered, we can't access individual elements using an index. We can, however, easily check for membership (to see if something is contained in a set) and take the unions and intersections of sets by using the built-in set functions." 1389 | ] 1390 | }, 1391 | { 1392 | "cell_type": "markdown", 1393 | "metadata": {}, 1394 | "source": [ 1395 | "由于`set`是无序的,我们不能通过索引来访问某个单独的元素.但是我们能简单验证他的元素构成(去判断某个元素是否在其中)并使用内置方法来求集合的交集和并集." 1396 | ] 1397 | }, 1398 | { 1399 | "cell_type": "code", 1400 | "execution_count": null, 1401 | "metadata": { 1402 | "collapsed": true 1403 | }, 1404 | "outputs": [], 1405 | "source": [ 1406 | "'cats' in animal_set # Here we check for membership using the `in` keyword." 
1407 | ] 1408 | }, 1409 | { 1410 | "cell_type": "markdown", 1411 | "metadata": {}, 1412 | "source": [ 1413 | "Here we checked to see whether the string 'cats' was contained within our `animal_set` and it returned `True`, telling us that it is indeed in our set.\n", 1414 | "\n", 1415 | "We can connect sets by using typical mathematical set operators, namely `|`, for union, and `&`, for intersection. Using `|` or `&` will return exactly what you would expect if you are familiar with sets in mathematics." 1416 | ] 1417 | }, 1418 | { 1419 | "cell_type": "markdown", 1420 | "metadata": {}, 1421 | "source": [ 1422 | "在这里,我们检查了'cats'字符串是否包含在我们的`animal_set`中,并返回`True`,告诉我们它的确在我们的集合中.我们可以使用经典的数学集合运算符来进行集合操作,`|`,表示取并集操作,`&`表示取交集操作." 1423 | ] 1424 | }, 1425 | { 1426 | "cell_type": "code", 1427 | "execution_count": null, 1428 | "metadata": { 1429 | "collapsed": true 1430 | }, 1431 | "outputs": [], 1432 | "source": [ 1433 | "print(animal_set | things_i_like) # 交换两者的位置 things_i_like | animal_set 并没有什么区别" 1434 | ] 1435 | }, 1436 | { 1437 | "cell_type": "markdown", 1438 | "metadata": {}, 1439 | "source": [ 1440 | "Pairing two sets together with `|` combines the sets, removing any repetitions to make every set element unique." 1441 | ] 1442 | }, 1443 | { 1444 | "cell_type": "markdown", 1445 | "metadata": {}, 1446 | "source": [ 1447 | "使用`|`运算符可以求得两个集合的并集,并且移除重复的元素,使其唯一." 
1448 | ] 1449 | }, 1450 | { 1451 | "cell_type": "code", 1452 | "execution_count": null, 1453 | "metadata": { 1454 | "collapsed": true 1455 | }, 1456 | "outputs": [], 1457 | "source": [ 1458 | "print(animal_set & things_i_like) # You can also write things_i_like & animal_set with no difference" 1459 | ] 1460 | }, 1461 | { 1462 | "cell_type": "markdown", 1463 | "metadata": {}, 1464 | "source": [ 1465 | "Pairing two sets together with `&` will calculate the intersection of both sets, returning a set that only contains what they have in common.\n", 1466 | "\n", 1467 | "If you are interested in learning more about the built-in functions for sets, feel free to check out the [documentation](https://docs.python.org/2/library/sets.html)." 1468 | ] 1469 | }, 1470 | { 1471 | "cell_type": "markdown", 1472 | "metadata": {}, 1473 | "source": [ 1474 | "使用 `&` 运算符,可以取得两个集合的交集:返回两个集合中共有的元素.\n", 1475 | "\n", 1476 | "如果你对sets的内置函数有兴趣, 可以 [点我查看文档](https://docs.python.org/2/library/sets.html)." 1477 | ] 1478 | }, 1479 | { 1480 | "cell_type": "markdown", 1481 | "metadata": {}, 1482 | "source": [ 1483 | "### Dictionaries\n", 1484 | "\n", 1485 | "Another essential data structure in Python is the dictionary. Dictionaries are defined with a combination of curly braces (`{}`) and colons (`:`). The braces define the beginning and end of a dictionary and the colons indicate key-value pairs. A dictionary is essentially a set of key-value pairs. The key of any entry must be an immutable data type. This makes both strings and tuples candidates. Keys can be both added and deleted.\n", 1486 | "\n", 1487 | "In the following example, we have a dictionary composed of key-value pairs where the key is a genre of fiction (`string`) and the value is a list of books (`list`) within that genre. Since a collection is still considered a single entity, we can use one to collect multiple variables or values into one key-value pair." 
1488 | ] 1489 | }, 1490 | { 1491 | "cell_type": "markdown", 1492 | "metadata": {}, 1493 | "source": [ 1494 | "### 字典\n", 1495 | "\n", 1496 | "字典,另一个在python十分有用的数据结构.字典使用花括号(`{}`)和冒号(`:`)来表示.括号定义了一个字典起始和结束,冒号用于表示键-值对(key-value pair). 字典其实是键值对的集合(`set`).`key`可以是任何不可变的数据类型,`strings`和`tuples`都可以作为`key`.`key`是可以添加和删除的.\n", 1497 | "\n", 1498 | "在下面的示例中,我们有一个由键值对组成的字典,其中小说类型(`string`)是`key`,该类型中的书籍列表(`list`)是`value`.可以把书籍列表视为单个实体,我们就可以把小说和书籍列表构成一个键值对." 1499 | ] 1500 | }, 1501 | { 1502 | "cell_type": "code", 1503 | "execution_count": null, 1504 | "metadata": { 1505 | "collapsed": true 1506 | }, 1507 | "outputs": [], 1508 | "source": [ 1509 | "my_dict = {\"High Fantasy\": [\"Wheel of Time\", \"Lord of the Rings\"], \n", 1510 | " \"Sci-fi\": [\"Book of the New Sun\", \"Neuromancer\", \"Snow Crash\"],\n", 1511 | " \"Weird Fiction\": [\"At the Mountains of Madness\", \"The House on the Borderland\"]}" 1512 | ] 1513 | }, 1514 | { 1515 | "cell_type": "markdown", 1516 | "metadata": {}, 1517 | "source": [ 1518 | "After defining a dictionary, we can access any individual value by indicating its key in brackets." 1519 | ] 1520 | }, 1521 | { 1522 | "cell_type": "markdown", 1523 | "metadata": {}, 1524 | "source": [ 1525 | "定义一个字典后,我们能通过方括号使用`key`作为索引,来访问对应的`value`." 1526 | ] 1527 | }, 1528 | { 1529 | "cell_type": "code", 1530 | "execution_count": null, 1531 | "metadata": { 1532 | "collapsed": true 1533 | }, 1534 | "outputs": [], 1535 | "source": [ 1536 | "print(my_dict[\"Sci-fi\"])" 1537 | ] 1538 | }, 1539 | { 1540 | "cell_type": "markdown", 1541 | "metadata": {}, 1542 | "source": [ 1543 | "We can also change the value associated with a given key." 1544 | ] 1545 | }, 1546 | { 1547 | "cell_type": "markdown", 1548 | "metadata": {}, 1549 | "source": [ 1550 | "我们也能通过给定`key`来改变`value`."
1551 | ] 1552 | }, 1553 | { 1554 | "cell_type": "code", 1555 | "execution_count": null, 1556 | "metadata": { 1557 | "collapsed": true 1558 | }, 1559 | "outputs": [], 1560 | "source": [ 1561 | "my_dict[\"Sci-fi\"] = \"I can't read\"\n", 1562 | "print(my_dict[\"Sci-fi\"])" 1563 | ] 1564 | }, 1565 | { 1566 | "cell_type": "markdown", 1567 | "metadata": {}, 1568 | "source": [ 1569 | "Adding a new key-value pair is as simple as defining it." 1570 | ] 1571 | }, 1572 | { 1573 | "cell_type": "markdown", 1574 | "metadata": {}, 1575 | "source": [ 1576 | "可以很轻松的添加键值对" 1577 | ] 1578 | }, 1579 | { 1580 | "cell_type": "code", 1581 | "execution_count": null, 1582 | "metadata": { 1583 | "collapsed": true 1584 | }, 1585 | "outputs": [], 1586 | "source": [ 1587 | "my_dict[\"Historical Fiction\"] = [\"Pillars of the Earth\"]\n", 1588 | "print(my_dict[\"Historical Fiction\"])" 1589 | ] 1590 | }, 1591 | { 1592 | "cell_type": "code", 1593 | "execution_count": null, 1594 | "metadata": { 1595 | "collapsed": true 1596 | }, 1597 | "outputs": [], 1598 | "source": [ 1599 | "print(my_dict)" 1600 | ] 1601 | }, 1602 | { 1603 | "cell_type": "markdown", 1604 | "metadata": {}, 1605 | "source": [ 1606 | "## String Shenanigans\n", 1607 | "\n", 1608 | "We already know that strings are generally used for text. We can use built-in operations to combine, split, and format strings easily, depending on our needs.\n", 1609 | "\n", 1610 | "For strings, the `+` symbol indicates concatenation. It will combine two strings into a longer string." 1611 | ] 1612 | }, 1613 | { 1614 | "cell_type": "markdown", 1615 | "metadata": {}, 1616 | "source": [ 1617 | "## 字符串的小技巧\n", 1618 | "\n", 1619 | "我们已经知道字符串通常用于文本.根据我们的需要,我们可以使用内置操作轻松组合,拆分和格式化字符串. \n", 1620 | "\n", 1621 | "`+` 表示字符串语言的连接.它将两个字符串组合成一个更长的字符串."
1622 | ] 1623 | }, 1624 | { 1625 | "cell_type": "code", 1626 | "execution_count": null, 1627 | "metadata": { 1628 | "collapsed": true 1629 | }, 1630 | "outputs": [], 1631 | "source": [ 1632 | "first_string = '\"Beware the Jabberwock, my son! /The jaws that bite, the claws that catch! /'\n", 1633 | "second_string = 'Beware the Jubjub bird, and shun /The frumious Bandersnatch!\"/'\n", 1634 | "third_string = first_string + second_string\n", 1635 | "print(third_string)" 1636 | ] 1637 | }, 1638 | { 1639 | "cell_type": "markdown", 1640 | "metadata": {}, 1641 | "source": [ 1642 | "Strings are also indexed much in the same way that lists are." 1643 | ] 1644 | }, 1645 | { 1646 | "cell_type": "markdown", 1647 | "metadata": {}, 1648 | "source": [ 1649 | "字符串能够像列表那样进行索引." 1650 | ] 1651 | }, 1652 | { 1653 | "cell_type": "code", 1654 | "execution_count": null, 1655 | "metadata": { 1656 | "collapsed": true 1657 | }, 1658 | "outputs": [], 1659 | "source": [ 1660 | "my_string = 'Supercalifragilisticexpialidocious'\n", 1661 | "print('The first letter is: ', my_string[0]) # Uppercase S\n", 1662 | "print('The last letter is: ', my_string[-1]) # lowercase s\n", 1663 | "print('The second to last letter is: ', my_string[-2]) # lowercase u\n", 1664 | "print('The first five characters are: ', my_string[0:5]) # Remember: slicing doesn't include the final element!\n", 1665 | "print('Reverse it!: ', my_string[::-1])" 1666 | ] 1667 | }, 1668 | { 1669 | "cell_type": "markdown", 1670 | "metadata": {}, 1671 | "source": [ 1672 | "Built-in objects and classes often have special functions associated with them that are called methods. We access these methods by using a period ('.'). We will cover objects and their associated methods more in another lecture!\n", 1673 | "\n", 1674 | "Using string methods we can count instances of a character or group of characters." 
1675 | ] 1676 | }, 1677 | { 1678 | "cell_type": "markdown", 1679 | "metadata": {}, 1680 | "source": [ 1681 | "内置对象和类通常具有与它们相关联的特殊功能,称为方法.\n", 1682 | "我们通过使用句号('.')调用这些方法.我们将在另一节课中介绍对象及其相关的使用方法!\n", 1683 | "\n", 1684 | "使用字符串方法:`.count`能够对一个或多个字符进行计数." 1685 | ] 1686 | }, 1687 | { 1688 | "cell_type": "code", 1689 | "execution_count": null, 1690 | "metadata": { 1691 | "collapsed": true, 1692 | "scrolled": true 1693 | }, 1694 | "outputs": [], 1695 | "source": [ 1696 | "print('Count of the letter i in Supercalifragilisticexpialidocious: ', my_string.count('i'))\n", 1697 | "print('Count of \"li\" in the same word: ', my_string.count('li'))" 1698 | ] 1699 | }, 1700 | { 1701 | "cell_type": "markdown", 1702 | "metadata": {}, 1703 | "source": [ 1704 | "We can also find the first instance of a character or group of characters in a string." 1705 | ] 1706 | }, 1707 | { 1708 | "cell_type": "markdown", 1709 | "metadata": {}, 1710 | "source": [ 1711 | "我们也能通过`.find`找到字符在字符串中第一次出现的位置." 1712 | ] 1713 | }, 1714 | { 1715 | "cell_type": "code", 1716 | "execution_count": null, 1717 | "metadata": { 1718 | "collapsed": true 1719 | }, 1720 | "outputs": [], 1721 | "source": [ 1722 | "print('The first time i appears is at index: ', my_string.find('i'))" 1723 | ] 1724 | }, 1725 | { 1726 | "cell_type": "markdown", 1727 | "metadata": {}, 1728 | "source": [ 1729 | "As well as replace characters in a string." 
1730 | ] 1731 | }, 1732 | { 1733 | "cell_type": "markdown", 1734 | "metadata": {}, 1735 | "source": [ 1736 | "以及字符串的替换:`.replace`" 1737 | ] 1738 | }, 1739 | { 1740 | "cell_type": "code", 1741 | "execution_count": null, 1742 | "metadata": { 1743 | "collapsed": true 1744 | }, 1745 | "outputs": [], 1746 | "source": [ 1747 | "print(\"All i's are now a's: \", my_string.replace('i', 'a'))" 1748 | ] 1749 | }, 1750 | { 1751 | "cell_type": "code", 1752 | "execution_count": null, 1753 | "metadata": { 1754 | "collapsed": true 1755 | }, 1756 | "outputs": [], 1757 | "source": [ 1758 | "print(\"It's raining cats and dogs\".replace('dogs', 'more cats'))" 1759 | ] 1760 | }, 1761 | { 1762 | "cell_type": "markdown", 1763 | "metadata": {}, 1764 | "source": [ 1765 | "There are also some methods that are unique to strings. The function `upper()` will convert all characters in a string to uppercase, while `lower()` will convert all characters in a string to lowercase!" 1766 | ] 1767 | }, 1768 | { 1769 | "cell_type": "markdown", 1770 | "metadata": {}, 1771 | "source": [ 1772 | "这里也有字符串特有的方法.比如函数`upper()`,将所有的字符转化为大写;类似的, `lower()` 讲字符转化成小写" 1773 | ] 1774 | }, 1775 | { 1776 | "cell_type": "code", 1777 | "execution_count": null, 1778 | "metadata": { 1779 | "collapsed": true 1780 | }, 1781 | "outputs": [], 1782 | "source": [ 1783 | "my_string = \"I can't hear you\"\n", 1784 | "print(my_string.upper())\n", 1785 | "my_string = \"I said HELLO\"\n", 1786 | "print(my_string.lower())" 1787 | ] 1788 | }, 1789 | { 1790 | "cell_type": "markdown", 1791 | "metadata": {}, 1792 | "source": [ 1793 | "### String Formatting\n", 1794 | "\n", 1795 | "Using the `format()` method we can add in variable values and generally format our strings." 1796 | ] 1797 | }, 1798 | { 1799 | "cell_type": "markdown", 1800 | "metadata": {}, 1801 | "source": [ 1802 | "### 字符串格式化输出\n", 1803 | "\n", 1804 | "用`format`方法,我们能变量的值进行格式化后输出成文本." 
1805 | ] 1806 | }, 1807 | { 1808 | "cell_type": "code", 1809 | "execution_count": null, 1810 | "metadata": { 1811 | "collapsed": true 1812 | }, 1813 | "outputs": [], 1814 | "source": [ 1815 | "my_string = \"{0} {1}\".format('Marco', 'Polo')\n", 1816 | "print(my_string)" 1817 | ] 1818 | }, 1819 | { 1820 | "cell_type": "code", 1821 | "execution_count": null, 1822 | "metadata": { 1823 | "collapsed": true 1824 | }, 1825 | "outputs": [], 1826 | "source": [ 1827 | "my_string = \"{1} {0}\".format('Marco', 'Polo')\n", 1828 | "print(my_string)" 1829 | ] 1830 | }, 1831 | { 1832 | "cell_type": "markdown", 1833 | "metadata": {}, 1834 | "source": [ 1835 | "We use braces (`{}`) to indicate parts of the string that will be filled in later and we use the arguments of the `format()` function to provide the values to substitute. The numbers within the braces indicate the index of the value in the `format()` arguments." 1836 | ] 1837 | }, 1838 | { 1839 | "cell_type": "markdown", 1840 | "metadata": {}, 1841 | "source": [ 1842 | "我们使用花括号(`{}`)来表示稍后将被填充的字符串的部分,我们使用`format()`函数的参数来提供替换的值.\n", 1843 | "大括号中的数字表示`format()`参数中的值的索引." 1844 | ] 1845 | }, 1846 | { 1847 | "cell_type": "markdown", 1848 | "metadata": {}, 1849 | "source": [ 1850 | "See the `format()` [documentation](https://docs.python.org/2/library/string.html#format-examples) for additional examples." 1851 | ] 1852 | }, 1853 | { 1854 | "cell_type": "markdown", 1855 | "metadata": {}, 1856 | "source": [ 1857 | "可以参考 `format()` [文档](https://docs.python.org/2/library/string.html#format-examples) 查看更多示例." 1858 | ] 1859 | }, 1860 | { 1861 | "cell_type": "markdown", 1862 | "metadata": {}, 1863 | "source": [ 1864 | "If you need some quick and dirty formatting, you can instead use the `%` symbol, called the string formatting operator. " 1865 | ] 1866 | }, 1867 | { 1868 | "cell_type": "markdown", 1869 | "metadata": {}, 1870 | "source": [ 1871 | "如果你需要一种快餐化的格式化输出,你可以用`%`来进行操作,这被称为字符串格式化运算符." 
1872 | ] 1873 | }, 1874 | { 1875 | "cell_type": "code", 1876 | "execution_count": null, 1877 | "metadata": { 1878 | "collapsed": true 1879 | }, 1880 | "outputs": [], 1881 | "source": [ 1882 | "print('insert %s here' % 'value')" 1883 | ] 1884 | }, 1885 | { 1886 | "cell_type": "markdown", 1887 | "metadata": {}, 1888 | "source": [ 1889 | "The `%` symbol cues Python to create a placeholder. The character that follows the `%` (in the string) indicates what type the value put into the placeholder will have. This character is called a *conversion type*. Once the string has been closed, we need another `%` that will be followed by the values to insert. In the case of one value, you can just put it there. If you are inserting more than one value, they must be enclosed in a tuple." 1890 | ] 1891 | }, 1892 | { 1893 | "cell_type": "markdown", 1894 | "metadata": {}, 1895 | "source": [ 1896 | "`%`符号告诉Python创建一个占位符. `%`(在字符串中)后面的任何字符表示占位符所输入的值的类型.这个字符称为*conversion type*(*转换类型*).一旦字符串被关闭,我们需要另外一个`%`,后面跟随有值插入.在一个值的情况下,你可以把它放在那里.如果要插入多个值,则必须将置于元组中." 1897 | ] 1898 | }, 1899 | { 1900 | "cell_type": "code", 1901 | "execution_count": null, 1902 | "metadata": { 1903 | "collapsed": true 1904 | }, 1905 | "outputs": [], 1906 | "source": [ 1907 | "print('There are %s cats in my %s' % (13, 'apartment'))" 1908 | ] 1909 | }, 1910 | { 1911 | "cell_type": "markdown", 1912 | "metadata": {}, 1913 | "source": [ 1914 | "In these examples, the `%s` indicates that Python should convert the values into strings. There are multiple conversion types that you can use to get more specific with the formatting. See the string formatting [documentation](https://docs.python.org/2/library/stdtypes.html#string-formatting) for additional examples and more complete details on use." 1915 | ] 1916 | }, 1917 | { 1918 | "cell_type": "markdown", 1919 | "metadata": {}, 1920 | "source": [ 1921 | "在这些示例中,`%s`表示Python应该将值转换为字符串.您可以使用多种转换类型来使用格式进行更具体化. 
详见字符串格式化 [文档](https://docs.python.org/2/library/stdtypes.html#string-formatting) 这里有更多的示例和更完整的细节以供参考." 1922 | ] 1923 | }, 1924 | { 1925 | "cell_type": "markdown", 1926 | "metadata": {}, 1927 | "source": [ 1928 | "## Logical Operators\n", 1929 | "### Basic Logic\n", 1930 | "\n", 1931 | "Logical operators deal with `boolean` values, as we briefly covered before. If you recall, a `bool` takes on one of two values, `True` or `False` (or $1$ or $0$). The basic logical statements that we can make are defined using the built-in comparators. These are `==` (equal), `!=` (not equal), `<` (less than), `>` (greater than), `<=` (less than or equal to), and `>=` (greater than or equal to)." 1932 | ] 1933 | }, 1934 | { 1935 | "cell_type": "markdown", 1936 | "metadata": {}, 1937 | "source": [ 1938 | "## 逻辑运算符\n", 1939 | "### 基础的逻辑运算符\n", 1940 | "\n", 1941 | "逻辑运算符主要处理的是`boolean`变量,就像我们之前提过的那样.如果你还记得,`bool`只能是`True`或者 `False`(or $1$ or $0$).我们可以使用内置的比较运算符来定义基本的逻辑语句.这里有 `==` (等于号), `!=` (不等号), `<` (小于号), `>` (大于号), `<=` (小于等于号), 和 `>=` (大于等于号)." 1942 | ] 1943 | }, 1944 | { 1945 | "cell_type": "code", 1946 | "execution_count": null, 1947 | "metadata": { 1948 | "collapsed": true 1949 | }, 1950 | "outputs": [], 1951 | "source": [ 1952 | "print(5 == 5)" 1953 | ] 1954 | }, 1955 | { 1956 | "cell_type": "code", 1957 | "execution_count": null, 1958 | "metadata": { 1959 | "collapsed": true 1960 | }, 1961 | "outputs": [], 1962 | "source": [ 1963 | "print(5 > 5)" 1964 | ] 1965 | }, 1966 | { 1967 | "cell_type": "markdown", 1968 | "metadata": {}, 1969 | "source": [ 1970 | "These comparators also work in conjunction with variables." 1971 | ] 1972 | }, 1973 | { 1974 | "cell_type": "markdown", 1975 | "metadata": {}, 1976 | "source": [ 1977 | "这些比较运算符同样能作用于变量." 
1978 | ] 1979 | }, 1980 | { 1981 | "cell_type": "code", 1982 | "execution_count": null, 1983 | "metadata": { 1984 | "collapsed": true 1985 | }, 1986 | "outputs": [], 1987 | "source": [ 1988 | "m = 2\n", 1989 | "n = 23\n", 1990 | "print(m < n)" 1991 | ] 1992 | }, 1993 | { 1994 | "cell_type": "markdown", 1995 | "metadata": {}, 1996 | "source": [ 1997 | "We can string these comparators together to make more complex logical statements using the logical operators `or`, `and`, and `not`. " 1998 | ] 1999 | }, 2000 | { 2001 | "cell_type": "markdown", 2002 | "metadata": {}, 2003 | "source": [ 2004 | "我们可以将这些比较运算符进行组合,以表现更复杂的逻辑语句" 2005 | ] 2006 | }, 2007 | { 2008 | "cell_type": "code", 2009 | "execution_count": null, 2010 | "metadata": { 2011 | "collapsed": true 2012 | }, 2013 | "outputs": [], 2014 | "source": [ 2015 | "statement_1 = 10 > 2\n", 2016 | "statement_2 = 4 <= 6\n", 2017 | "print(\"Statement 1 truth value: {0}\".format(statement_1))\n", 2018 | "print(\"Statement 2 truth value: {0}\".format(statement_2))\n", 2019 | "print(\"Statement 1 and Statement 2: {0}\".format(statement_1 and statement_2))" 2020 | ] 2021 | }, 2022 | { 2023 | "cell_type": "markdown", 2024 | "metadata": {}, 2025 | "source": [ 2026 | "The `or` operator performs a logical `or` calculation. This is an inclusive `or`, so if either component paired together by `or` is `True`, the whole statement will be `True`. The `and` statement only outputs `True` if all components that are `and`ed together are True. Otherwise it will output `False`. The `not` statement simply inverts the truth value of whichever statement follows it. So a `True` statement will be evaluated as `False` when a `not` is placed in front of it. Similarly, a `False` statement will become `True` when a `not` is in front of it.\n", 2027 | "\n", 2028 | "Say that we have two logical statements, or assertions, $P$ and $Q$. 
The truth table for the basic logical operators is as follows:\n", 2029 | "\n", 2030 | "| P | Q | `not` P| P `and` Q | P `or` Q|\n", 2031 | "|:-----:|:-----:|:---:|:---:|:---:|\n", 2032 | "| `True` | `True` | `False` | `True` | `True` |\n", 2033 | "| `False` | `True` | `True` | `False` | `True` |\n", 2034 | "| `True` | `False` | `False` | `False` | `True` |\n", 2035 | "| `False` | `False` | `True` | `False` | `False` |\n", 2036 | "\n", 2037 | "We can string multiple logical statements together using the logical operators." 2038 | ] 2039 | }, 2040 | { 2041 | "cell_type": "markdown", 2042 | "metadata": {}, 2043 | "source": [ 2044 | "`or`操作符执行逻辑 `或` 运算. This is an inclusive `or`, `or`连接的语句中有一个为`True`, 那么整个语句为`True`. `and`语句为`True`当且仅当所有`and`连接的语句全为`True`. 否则就输出`False`. `not`语句只是简单地反转语句的真值. 所以一个 `True`语句经过`not`计算后变成 `False`. 相似的, 一个 `False` 语句经过`not`计算后变成`True`.\n", 2045 | "\n", 2046 | "比如这里有两个逻辑语句, 或者叫断言, $P$ 和 $Q$. 简单逻辑运算的真值表如下:\n", 2047 | "\n", 2048 | "| P | Q | `not` P| P `and` Q | P `or` Q|\n", 2049 | "|:-----:|:-----:|:---:|:---:|:---:|\n", 2050 | "| `True` | `True` | `False` | `True` | `True` |\n", 2051 | "| `False` | `True` | `True` | `False` | `True` |\n", 2052 | "| `True` | `False` | `False` | `False` | `True` |\n", 2053 | "| `False` | `False` | `True` | `False` | `False` |\n", 2054 | "\n", 2055 | "我们可以使用逻辑运算符将多个逻辑语句串在一起." 2056 | ] 2057 | }, 2058 | { 2059 | "cell_type": "code", 2060 | "execution_count": null, 2061 | "metadata": { 2062 | "collapsed": true 2063 | }, 2064 | "outputs": [], 2065 | "source": [ 2066 | "print(((2 < 3) and (3 > 0)) or ((5 > 6) and not (4 < 2)))" 2067 | ] 2068 | }, 2069 | { 2070 | "cell_type": "markdown", 2071 | "metadata": {}, 2072 | "source": [ 2073 | "Logical statements can be as simple or complex as we like, depending on what we need to express. Evaluating the above logical statement step by step we see that we are evaluating (`True and True`) `or` (`False and not False`). 
This becomes `True or (False and True)`, which reduces to `True or False` and is ultimately evaluated as `True`." 2074 | ] 2075 | }, 2076 | { 2077 | "cell_type": "markdown", 2078 | "metadata": {}, 2079 | "source": [ 2080 | "逻辑语句可以很简单,也可以很复杂,这个取决于我们如何去表达. 上述的逻辑语句我们可以一步步去计算,首先计算最里层括号里的逻辑语句: (`True and True`) `or` (`False and not False`). 接下来: `True or (False and True)`, 依次计算得到: `True or False`, 最终就会变成 `True`." 2081 | ] 2082 | }, 2083 | { 2084 | "cell_type": "markdown", 2085 | "metadata": {}, 2086 | "source": [ 2087 | "#### Truthiness\n", 2088 | "\n", 2089 | "Data types in Python have a fun characteristic called truthiness. What this means is that most built-in types will evaluate as either `True` or `False` when a boolean value is needed (such as with an if-statement). As a general rule, containers like strings, tuples, dictionaries, lists, and sets evaluate as `True` if they contain anything at all and as `False` if they contain nothing." 2090 | ] 2091 | }, 2092 | { 2093 | "cell_type": "markdown", 2094 | "metadata": {}, 2095 | "source": [ 2096 | "#### Truthiness\n", 2097 | "\n", 2098 | "在Python中的数据类型有一种特性叫做`Truthiness`.当我们需要进行布尔值判断时,大多数内置数据类型都可以进行逻辑判断(比如一个if语句).通用的规则是,比如像字符串,元组,字典,列表和集合,只要里面元素个数不为0,则返回`True`,否则返回`False`."
2099 | ] 2100 | }, 2101 | { 2102 | "cell_type": "code", 2103 | "execution_count": null, 2104 | "metadata": { 2105 | "collapsed": true 2106 | }, 2107 | "outputs": [], 2108 | "source": [ 2109 | "# Similar to how float() and int() work, bool() forces a value to be considered a boolean!\n", 2110 | "print(bool(''))" 2111 | ] 2112 | }, 2113 | { 2114 | "cell_type": "code", 2115 | "execution_count": null, 2116 | "metadata": { 2117 | "collapsed": true 2118 | }, 2119 | "outputs": [], 2120 | "source": [ 2121 | "print(bool('I have character!'))" 2122 | ] 2123 | }, 2124 | { 2125 | "cell_type": "code", 2126 | "execution_count": null, 2127 | "metadata": { 2128 | "collapsed": true 2129 | }, 2130 | "outputs": [], 2131 | "source": [ 2132 | "print(bool([]))" 2133 | ] 2134 | }, 2135 | { 2136 | "cell_type": "code", 2137 | "execution_count": null, 2138 | "metadata": { 2139 | "collapsed": true 2140 | }, 2141 | "outputs": [], 2142 | "source": [ 2143 | "print(bool([1, 2, 3]))" 2144 | ] 2145 | }, 2146 | { 2147 | "cell_type": "markdown", 2148 | "metadata": {}, 2149 | "source": [ 2150 | "And so on, for the other collections and containers. `None` also evaluates as `False`. The number `1` is equivalent to `True` and the number `0` is equivalent to `False` as well, in a boolean context." 2151 | ] 2152 | }, 2153 | { 2154 | "cell_type": "markdown", 2155 | "metadata": {}, 2156 | "source": [ 2157 | "此外,对于其他的`collections`和`containers`,在需要判断布尔值的情况下,`None`也视为`False`.数字`1`视为`True`,数字`0`视为`False`." 2158 | ] 2159 | }, 2160 | { 2161 | "cell_type": "markdown", 2162 | "metadata": {}, 2163 | "source": [ 2164 | "### If-statements\n", 2165 | "\n", 2166 | "We can create segments of code that only execute if a set of conditions is met. We use if-statements in conjunction with logical statements in order to create branches in our code. \n", 2167 | "\n", 2168 | "An `if` block gets entered when the condition is considered to be `True`. 
If the condition is evaluated as `False`, the `if` block is skipped, and the accompanying `else` block, if there is one, runs instead. Conditions are made using either logical operators or by using the truthiness of values in Python. An if-statement is defined with a colon and a block of indented text." 2169 | ] 2170 | }, 2171 | { 2172 | "cell_type": "markdown", 2173 | "metadata": {}, 2174 | "source": [ 2175 | "### If-statements(If语句)\n", 2176 | "\n", 2177 | "我们可以创建只在满足一定条件时执行的代码段.我们将if语句与逻辑语句相结合,以便在代码中创建分支.\n", 2178 | "\n", 2179 | "当条件被认为是`True`时,我们则进入`if`分支.当条件被认为是`False`时,我们则会跳过`if`分支,如果有`else`分支则进入`else`分支.\n", 2180 | "在Python中用逻辑语句或者变量的真值作为条件进行判断,if语句用冒号和一段缩进文本定义." 2181 | ] 2182 | }, 2183 | { 2184 | "cell_type": "code", 2185 | "execution_count": null, 2186 | "metadata": { 2187 | "collapsed": true 2188 | }, 2189 | "outputs": [], 2190 | "source": [ 2191 | "# This is the basic format of an if statement. This is a vacuous example.\n", 2192 | "# The string \"Condition\" will always be evaluated as True because it is a\n", 2193 | "# non-empty string. The purpose of this code is to show the formatting of\n", 2194 | "# an if-statement.\n", 2195 | "if \"Condition\":\n", 2196 | "    # This block of code will execute because the string is non-empty\n", 2197 | "    # Everything on these indented lines belongs to the if block\n", 2198 | "    print(True)\n", 2199 | "else:\n", 2200 | "    # So if the condition that we examined with if is in fact False\n", 2201 | "    # This block of code will execute INSTEAD of the first block of code\n", 2202 | "    # Everything on these indented lines belongs to the else block\n", 2203 | "    print(False)\n", 2204 | "# The else block here will never execute because \"Condition\" is a non-empty string."
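] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same pattern works with any value that has a truthiness, not just string literals. The block below is a minimal sketch, assuming two hypothetical lists named `empty_cart` and `full_cart` that are not part of the original lecture. It shows an empty list sending execution to the `else` branch and a non-empty list sending it to the `if` branch.

```python
# Hypothetical example values, not taken from the lecture
empty_cart = []
full_cart = ['apples', 'pears']

# An empty list is falsy, so the else block runs
if empty_cart:
    empty_message = 'empty_cart has items'
else:
    empty_message = 'empty_cart is empty'

# A non-empty list is truthy, so the if block runs
if full_cart:
    full_message = 'full_cart has {0} items'.format(len(full_cart))
else:
    full_message = 'full_cart is empty'

print(empty_message)
print(full_message)
```

Relying on truthiness here is a style choice; an explicit comparison such as `len(full_cart) > 0` expresses the same condition."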
2205 | ] 2206 | }, 2207 | { 2208 | "cell_type": "code", 2209 | "execution_count": null, 2210 | "metadata": { 2211 | "collapsed": true 2212 | }, 2213 | "outputs": [], 2214 | "source": [ 2215 | "i = 4\n", 2216 | "if i == 5:\n", 2217 | " print('The variable i has a value of 5')" 2218 | ] 2219 | }, 2220 | { 2221 | "cell_type": "markdown", 2222 | "metadata": {}, 2223 | "source": [ 2224 | "Because in this example `i = 4` and the if-statement is only looking for whether `i` is equal to `5`, the print statement will never be executed. We can add in an `else` statement to create a contingency block of code in case the condition in the if-statement is not evaluated as `True`." 2225 | ] 2226 | }, 2227 | { 2228 | "cell_type": "markdown", 2229 | "metadata": {}, 2230 | "source": [ 2231 | "这个示例中`i = 4`,if语句只是判断`i`是否等于`5`,所以print语句永远不会执行.为了应对这种if条件判断永远不为`True`的情况下,我们可以使用`else`语句创建一个代码块应对这种情况." 2232 | ] 2233 | }, 2234 | { 2235 | "cell_type": "code", 2236 | "execution_count": null, 2237 | "metadata": { 2238 | "collapsed": true 2239 | }, 2240 | "outputs": [], 2241 | "source": [ 2242 | "i = 4\n", 2243 | "if i == 5:\n", 2244 | " print(\"All lines in this indented block are part of this block\")\n", 2245 | " print('The variable i has a value of 5')\n", 2246 | "else:\n", 2247 | " print(\"All lines in this indented block are part of this block\")\n", 2248 | " print('The variable i is not equal to 5')" 2249 | ] 2250 | }, 2251 | { 2252 | "cell_type": "markdown", 2253 | "metadata": {}, 2254 | "source": [ 2255 | "We can implement other branches off of the same if-statement by using `elif`, an abbreviation of \"else if\". We can include as many `elifs` as we like until we have exhausted all the logical branches of a condition." 2256 | ] 2257 | }, 2258 | { 2259 | "cell_type": "markdown", 2260 | "metadata": {}, 2261 | "source": [ 2262 | "我们可以添加和if语句功能一样的`elif`语句,他是`else if`的缩写.我们能添加任意多个`elifs`语句,直到我们穷尽了所有逻辑分支条件." 
2263 | ] 2264 | }, 2265 | { 2266 | "cell_type": "code", 2267 | "execution_count": null, 2268 | "metadata": { 2269 | "collapsed": true 2270 | }, 2271 | "outputs": [], 2272 | "source": [ 2273 | "i = 1\n", 2274 | "if i == 1:\n", 2275 | "    print('The variable i has a value of 1')\n", 2276 | "elif i == 2:\n", 2277 | "    print('The variable i has a value of 2')\n", 2278 | "elif i == 3:\n", 2279 | "    print('The variable i has a value of 3')\n", 2280 | "else:\n", 2281 | "    print(\"I don't care what i is\")" 2282 | ] 2283 | }, 2284 | { 2285 | "cell_type": "markdown", 2286 | "metadata": {}, 2287 | "source": [ 2288 | "You can also nest if-statements within if-statements to check for further conditions." 2289 | ] 2290 | }, 2291 | { 2292 | "cell_type": "markdown", 2293 | "metadata": {}, 2294 | "source": [ 2295 | "如果是多条件判断,我们可以使用嵌套形式的if语句" 2296 | ] 2297 | }, 2298 | { 2299 | "cell_type": "code", 2300 | "execution_count": null, 2301 | "metadata": { 2302 | "collapsed": true 2303 | }, 2304 | "outputs": [], 2305 | "source": [ 2306 | "i = 10\n", 2307 | "if i % 2 == 0:\n", 2308 | "    if i % 3 == 0:\n", 2309 | "        print('i is divisible by both 2 and 3! Wow!')\n", 2310 | "    elif i % 5 == 0:\n", 2311 | "        print('i is divisible by both 2 and 5! Wow!')\n", 2312 | "    else:\n", 2313 | "        print('i is divisible by 2, but not 3 or 5. Meh.')\n", 2314 | "else:\n", 2315 | "    print('I guess that i is an odd number. Boring.')" 2316 | ] 2317 | }, 2318 | { 2319 | "cell_type": "markdown", 2320 | "metadata": {}, 2321 | "source": [ 2322 | "Remember that we can group multiple conditions together by using the logical operators!" 2323 | ] 2324 | }, 2325 | { 2326 | "cell_type": "markdown", 2327 | "metadata": {}, 2328 | "source": [ 2329 | "请记住,我们同样能用逻辑操作符进行多条件判断!"
2330 | ] 2331 | }, 2332 | { 2333 | "cell_type": "code", 2334 | "execution_count": null, 2335 | "metadata": { 2336 | "collapsed": true 2337 | }, 2338 | "outputs": [], 2339 | "source": [ 2340 | "i = 5\n", 2341 | "j = 12\n", 2342 | "if i < 10 and j > 11:\n", 2343 | "    print('{0} is less than 10 and {1} is greater than 11! How novel and interesting!'.format(i, j))" 2344 | ] 2345 | }, 2346 | { 2347 | "cell_type": "markdown", 2348 | "metadata": {}, 2349 | "source": [ 2350 | "You can use the comparators to compare strings!" 2351 | ] 2352 | }, 2353 | { 2354 | "cell_type": "markdown", 2355 | "metadata": {}, 2356 | "source": [ 2357 | "你能用比较操作符来比较字符串!" 2358 | ] 2359 | }, 2360 | { 2361 | "cell_type": "code", 2362 | "execution_count": null, 2363 | "metadata": { 2364 | "collapsed": true 2365 | }, 2366 | "outputs": [], 2367 | "source": [ 2368 | "my_string = \"Carthago delenda est\"\n", 2369 | "if my_string == \"Carthago delenda est\":\n", 2370 | "    print('And so it was! For the glory of Rome!')\n", 2371 | "else:\n", 2372 | "    print('War elephants are TERRIFYING. I am staying home.')" 2373 | ] 2374 | }, 2375 | { 2376 | "cell_type": "markdown", 2377 | "metadata": {}, 2378 | "source": [ 2379 | "As with other data types, `==` will check for whether the two things on either side of it have the same value. In this case, we compare whether the values of the strings are the same. Using `>` or `<` or any of the other comparators is not quite so intuitive, however, so we will stay away from using those comparators with strings in this lecture. Comparators will examine the [lexicographical order](https://en.wikipedia.org/wiki/Lexicographical_order) of the strings, which might be a bit more in-depth than you might like." 2380 | ] 2381 | }, 2382 | { 2383 | "cell_type": "markdown", 2384 | "metadata": {}, 2385 | "source": [ 2386 | "和其他数据类型一样, `==`同样能检测等号两边参数值是否相等. 在这个示例中,我们比较两个字符串的值是否相等. 然而对于 `>` 或者 `<` 或者其他比较操作符就不是那么直观了,因此在本次课程中我们将避免对字符串使用这些比较操作符. 
如果你想更深入了解下字符串的比较操作可以查阅 [lexicographical order](https://en.wikipedia.org/wiki/Lexicographical_order) ." 2387 | ] 2388 | }, 2389 | { 2390 | "cell_type": "markdown", 2391 | "metadata": {}, 2392 | "source": [ 2393 | "Some built-in functions return a boolean value, so they can be used as conditions in an if-statement. User-defined functions can also be constructed so that they return a boolean value. This will be covered later with function definition!\n", 2394 | "\n", 2395 | "The `in` keyword is generally used to check whether a value is contained in another value, such as a string or a collection. We can check membership in the context of an if-statement and use it to output a truth value." 2396 | ] 2397 | }, 2398 | { 2399 | "cell_type": "markdown", 2400 | "metadata": {}, 2401 | "source": [ 2402 | "一些内置函数会返回布尔值,所以他们也可以被当做if语句的判断条件使用.能返回布尔值的用户自定义函数也可以这样操作.我们在后面会介绍自定义函数!\n", 2403 | "\n", 2404 | "`in`关键字通常用于检查一个值在另一个值中的成员关系.我们可以把这个成员检查的结果作为if语句条件判断." 2405 | ] 2406 | }, 2407 | { 2408 | "cell_type": "code", 2409 | "execution_count": null, 2410 | "metadata": { 2411 | "collapsed": true 2412 | }, 2413 | "outputs": [], 2414 | "source": [ 2415 | "if 'a' in my_string or 'e' in my_string:\n", 2416 | "    print('Those are my favorite vowels!')" 2417 | ] 2418 | }, 2419 | { 2420 | "cell_type": "markdown", 2421 | "metadata": {}, 2422 | "source": [ 2423 | "Here we use `in` to check whether the variable `my_string` contains particular letters. We will later use `in` to iterate through lists!" 2424 | ] 2425 | }, 2426 | { 2427 | "cell_type": "markdown", 2428 | "metadata": {}, 2429 | "source": [ 2430 | "这里我们使用`in`检查变量`my_string`是否包含某个特定字母在里面.稍后我们会使用`in`对list进行循环." 2431 | ] 2432 | }, 2433 | { 2434 | "cell_type": "markdown", 2435 | "metadata": {}, 2436 | "source": [ 2437 | "## Loop Structures\n", 2438 | "\n", 2439 | "Loop structures are one of the most important parts of programming. The `for` loop and the `while` loop provide a way to run a block of code repeatedly. 
A `while` loop will keep iterating as long as a certain condition holds. The condition is checked before each iteration, and once it is no longer satisfied, the loop terminates. A `for` loop will iterate over a sequence of values and terminate when the sequence has ended. You can also include conditions within the `for` loop to decide whether it should terminate early, or you can simply let it run its course." 2440 | ] 2441 | }, 2442 | { 2443 | "cell_type": "markdown", 2444 | "metadata": {}, 2445 | "source": [ 2446 | "## Loop Structures(循环结构)\n", 2447 | "\n", 2448 | "循环结构是程序设计的重要组成部分.`for`循环和`while`循环提供了重复运行一段代码的方法.一个`while`循环将在某个条件成立时不断迭代.每次迭代之前都会检查条件,一旦条件不再满足,则循环终止. 一个`for`循环将在一系列值上遍历,并在序列结束时终止.你也可以在 `for`循环中利用一些条件提前中止,或者让它一直运行到结束." 2449 | ] 2450 | }, 2451 | { 2452 | "cell_type": "code", 2453 | "execution_count": null, 2454 | "metadata": { 2455 | "collapsed": true 2456 | }, 2457 | "outputs": [], 2458 | "source": [ 2459 | "i = 5\n", 2460 | "while i > 0: # We can write this as 'while i:' because 0 is False!\n", 2461 | "    i -= 1\n", 2462 | "    print('I am looping! {0} more to go!'.format(i))" 2463 | ] 2464 | }, 2465 | { 2466 | "cell_type": "markdown", 2467 | "metadata": {}, 2468 | "source": [ 2469 | "\n", 2470 | "With `while` loops we need to make sure that something actually changes from iteration to iteration so that the loop actually terminates. In this case, we use the shorthand `i -= 1` (short for `i = i - 1`) so that the value of `i` gets smaller with each iteration. Eventually `i` will be reduced to `0`, rendering the condition `False` and exiting the loop." 2471 | ] 2472 | }, 2473 | { 2474 | "cell_type": "markdown", 2475 | "metadata": {}, 2476 | "source": [ 2477 | "在`while`循环中,我们必须确保循环的判断条件会随着迭代而改变,这是为了避免无限死循环.在这个示例中,我们通过`i-=1`(i = i - 1的缩写),使得每次迭代后,`i`的值越来越小.当`i`等于0时,使得循环条件变为`False`,然后我们可以退出循环."
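] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As one more illustration of a `while` loop that is guaranteed to terminate, here is a minimal sketch that counts how many times a number can be halved before it drops below 1. The names `value` and `halvings` are hypothetical and do not appear elsewhere in the lecture. The loop ends because `value` shrinks on every iteration, so the condition must eventually fail.

```python
value = 40.0
halvings = 0
while value >= 1:
    value /= 2      # value changes on every iteration, so the loop cannot run forever
    halvings += 1   # count how many times we went through the loop

print('Halved {0} times, final value {1}'.format(halvings, value))
```

If the line that changes `value` were removed, the condition would stay `True` and the loop would never exit."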
2478 | ] 2479 | }, 2480 | { 2481 | "cell_type": "markdown", 2482 | "metadata": {}, 2483 | "source": [ 2484 | "A `for` loop iterates a set number of times, determined when you state the entry into the loop. In this case we are iterating over the sequence of values produced by `range()` (a list in Python 2, a lazy `range` object in Python 3). The `for` loop selects each value from the sequence, in order, and temporarily assigns it to `i` so that operations can be performed with the value." 2485 | ] 2486 | }, 2487 | { 2488 | "cell_type": "markdown", 2489 | "metadata": {}, 2490 | "source": [ 2491 | "当进入`for`循环后,会迭代一定次数.在这个示例中,我们对`range()`产生的序列进行迭代.每一次迭代,`for`循环会从序列中按顺序选择一个值赋值给`i`,以便后面用此值进行计算." 2492 | ] 2493 | }, 2494 | { 2495 | "cell_type": "code", 2496 | "execution_count": null, 2497 | "metadata": { 2498 | "collapsed": true 2499 | }, 2500 | "outputs": [], 2501 | "source": [ 2502 | "for i in range(5):\n", 2503 | "    print('I am looping! I have looped {0} times!'.format(i + 1))" 2504 | ] 2505 | }, 2506 | { 2507 | "cell_type": "markdown", 2508 | "metadata": {}, 2509 | "source": [ 2510 | "Note that in this `for` loop we use the `in` keyword. Use of the `in` keyword is not limited to checking for membership as in the if-statement example. You can iterate over any collection with a `for` loop by using the `in` keyword.\n", 2511 | "\n", 2512 | "In this next example, we will iterate over a `set` because we want to check for containment and add to a new set." 2513 | ] 2514 | }, 2515 | { 2516 | "cell_type": "markdown", 2517 | "metadata": {}, 2518 | "source": [ 2519 | "在这个`for`循环中,我们使用了`in`关键字.`in`关键字不光能用于if语句中作条件判断,也能在`for`循环中,遍历任何集合.\n", 2520 | "\n", 2521 | "在下一个示例中,我们将会对一个`set`进行遍历,我们希望检测是否包含某个成员,如果在则添加到新的集合中."
2522 | ] 2523 | }, 2524 | { 2525 | "cell_type": "code", 2526 | "execution_count": null, 2527 | "metadata": { 2528 | "collapsed": true 2529 | }, 2530 | "outputs": [], 2531 | "source": [ 2532 | "my_list = {'cats', 'dogs', 'lizards', 'cows', 'bats', 'sponges', 'humans'} # Lists all the animals in the world\n", 2533 | "mammal_list = {'cats', 'dogs', 'cows', 'bats', 'humans'} # Lists all the mammals in the world\n", 2534 | "my_new_list = set()\n", 2535 | "for animal in my_list:\n", 2536 | "    if animal in mammal_list:\n", 2537 | "        # This adds any animal that is both in my_list and mammal_list to my_new_list\n", 2538 | "        my_new_list.add(animal)\n", 2539 | "\n", 2540 | "print(my_new_list)" 2541 | ] 2542 | }, 2543 | { 2544 | "cell_type": "markdown", 2545 | "metadata": {}, 2546 | "source": [ 2547 | "There are two statements that are very helpful in dealing with both `for` and `while` loops. These are `break` and `continue`. If `break` is encountered at any point while a loop is executing, the loop will immediately end." 2548 | ] 2549 | }, 2550 | { 2551 | "cell_type": "markdown", 2552 | "metadata": {}, 2553 | "source": [ 2554 | "`break` 和 `continue`语句在循环语句中十分有用.当处于一个循环中,`break`语句被执行,循环会立即中止." 2555 | ] 2556 | }, 2557 | { 2558 | "cell_type": "code", 2559 | "execution_count": null, 2560 | "metadata": { 2561 | "collapsed": true 2562 | }, 2563 | "outputs": [], 2564 | "source": [ 2565 | "i = 10\n", 2566 | "while True:\n", 2567 | "    if i == 14:\n", 2568 | "        break\n", 2569 | "    i += 1 # This is shorthand for i = i + 1. 
It increments i with each iteration.\n", 2570 | "    print(i)" 2571 | ] 2572 | }, 2573 | { 2574 | "cell_type": "code", 2575 | "execution_count": null, 2576 | "metadata": { 2577 | "collapsed": true 2578 | }, 2579 | "outputs": [], 2580 | "source": [ 2581 | "for i in range(5):\n", 2582 | "    if i == 2:\n", 2583 | "        break\n", 2584 | "    print(i)" 2585 | ] 2586 | }, 2587 | { 2588 | "cell_type": "markdown", 2589 | "metadata": {}, 2590 | "source": [ 2591 | "The `continue` statement will tell the loop to immediately end this iteration and continue onto the next iteration of the loop." 2592 | ] 2593 | }, 2594 | { 2595 | "cell_type": "markdown", 2596 | "metadata": {}, 2597 | "source": [ 2598 | "`continue`语句告诉循环当前的迭代要立即结束并进入到下一个迭代." 2599 | ] 2600 | }, 2601 | { 2602 | "cell_type": "code", 2603 | "execution_count": null, 2604 | "metadata": { 2605 | "collapsed": true 2606 | }, 2607 | "outputs": [], 2608 | "source": [ 2609 | "i = 0\n", 2610 | "while i < 5:\n", 2611 | "    i += 1\n", 2612 | "    if i == 3:\n", 2613 | "        continue\n", 2614 | "    print(i)" 2615 | ] 2616 | }, 2617 | { 2618 | "cell_type": "markdown", 2619 | "metadata": {}, 2620 | "source": [ 2621 | "This loop skips printing the number $3$ because of the `continue` statement that executes when we enter the if-statement. The code never sees the command to print the number $3$ because it has already moved to the next iteration. The `break` and `continue` statements are further tools to help you control the flow of your loops and, as a result, your code." 2622 | ] 2623 | }, 2624 | { 2625 | "cell_type": "markdown", 2626 | "metadata": {}, 2627 | "source": [ 2628 | "这个例子中循环没有输出数字3,因为当if语句条件为真时,`continue`语句被执行,直接进入到下一个循环了,因此当程序执行的时候,'没看见过'这个`print`代码.`break` 和 `continue` 语句能进一步帮你掌控循环的走向,从而掌控代码的执行流程." 2629 | ] 2630 | }, 2631 | { 2632 | "cell_type": "markdown", 2633 | "metadata": {}, 2634 | "source": [ 2635 | "The variable that we use to iterate over a loop will retain its value when the loop exits. 
Similarly, any variables defined within the context of the loop will continue to exist outside of it." 2636 | ] 2637 | }, 2638 | { 2639 | "cell_type": "markdown", 2640 | "metadata": {}, 2641 | "source": [ 2642 | "循环使用的变量在循环退出时保留其值.类似地,定义在循环中的任何变量都将继续存在于它之外." 2643 | ] 2644 | }, 2645 | { 2646 | "cell_type": "code", 2647 | "execution_count": null, 2648 | "metadata": { 2649 | "collapsed": true 2650 | }, 2651 | "outputs": [], 2652 | "source": [ 2653 | "for i in range(5):\n", 2654 | " loop_string = 'I transcend the loop!'\n", 2655 | " print('I am eternal! I am {0} and I exist everywhere!'.format(i))\n", 2656 | "\n", 2657 | "print('I persist! My value is {0}'.format(i))\n", 2658 | "print(loop_string)" 2659 | ] 2660 | }, 2661 | { 2662 | "cell_type": "markdown", 2663 | "metadata": {}, 2664 | "source": [ 2665 | "We can also iterate over a dictionary!" 2666 | ] 2667 | }, 2668 | { 2669 | "cell_type": "markdown", 2670 | "metadata": {}, 2671 | "source": [ 2672 | "我们同样能遍历一个字典." 2673 | ] 2674 | }, 2675 | { 2676 | "cell_type": "code", 2677 | "execution_count": null, 2678 | "metadata": { 2679 | "collapsed": true 2680 | }, 2681 | "outputs": [], 2682 | "source": [ 2683 | "my_dict = {'firstname' : 'Inigo', 'lastname' : 'Montoya', 'nemesis' : 'Rugen'}" 2684 | ] 2685 | }, 2686 | { 2687 | "cell_type": "code", 2688 | "execution_count": null, 2689 | "metadata": { 2690 | "collapsed": true 2691 | }, 2692 | "outputs": [], 2693 | "source": [ 2694 | "for key in my_dict:\n", 2695 | " print(key)" 2696 | ] 2697 | }, 2698 | { 2699 | "cell_type": "markdown", 2700 | "metadata": {}, 2701 | "source": [ 2702 | "If we just iterate over a dictionary without doing anything else, we will only get the keys. We can either use the keys to get the values, like so:" 2703 | ] 2704 | }, 2705 | { 2706 | "cell_type": "markdown", 2707 | "metadata": {}, 2708 | "source": [ 2709 | "如果我们遍历一个字典,没有特殊操作的话,我们仅仅是拿到他的`key`,当然我们也可以通过`key`取得对应的`value`." 
2710 | ] 2711 | }, 2712 | { 2713 | "cell_type": "code", 2714 | "execution_count": null, 2715 | "metadata": { 2716 | "collapsed": true 2717 | }, 2718 | "outputs": [], 2719 | "source": [ 2720 | "for key in my_dict:\n", 2721 | "    print(my_dict[key])" 2722 | ] 2723 | }, 2724 | { 2725 | "cell_type": "markdown", 2726 | "metadata": {}, 2727 | "source": [ 2728 | "Or we can use the `.items()` method to get both key and value at the same time." 2729 | ] 2730 | }, 2731 | { 2732 | "cell_type": "markdown", 2733 | "metadata": {}, 2734 | "source": [ 2735 | "或者我们用`.items()`同时取到`key`和`value`." 2736 | ] 2737 | }, 2738 | { 2739 | "cell_type": "code", 2740 | "execution_count": null, 2741 | "metadata": { 2742 | "collapsed": true 2743 | }, 2744 | "outputs": [], 2745 | "source": [ 2746 | "#for key, value in my_dict.iteritems(): # This form can be used in Python 2\n", 2747 | "for key, value in my_dict.items(): # This is the Python 3 form\n", 2748 | "    print(key, ':', value)" 2749 | ] 2750 | }, 2751 | { 2752 | "cell_type": "markdown", 2753 | "metadata": {}, 2754 | "source": [ 2755 | "The `.items()` method yields each key-value pair as a tuple, and the `for` loop unpacks that tuple into `key, value` on each iteration of the loop!" 2756 | ] 2757 | }, 2758 | { 2759 | "cell_type": "markdown", 2760 | "metadata": {}, 2761 | "source": [ 2762 | "在每次循环中,`.items()`函数会以`tuple`的形式返回一个键值对,并且进行解压分别赋值给`key, value`." 2763 | ] 2764 | }, 2765 | { 2766 | "cell_type": "markdown", 2767 | "metadata": {}, 2768 | "source": [ 2769 | "## Functions\n", 2770 | "\n", 2771 | "A function is a reusable block of code that you can call repeatedly to make calculations, output data, or really do anything that you want. This is one of the key aspects of using a programming language. To add to the built-in functions in Python, you can define your own!"
2772 | ] 2773 | }, 2774 | { 2775 | "cell_type": "markdown", 2776 | "metadata": {}, 2777 | "source": [ 2778 | "## 函数\n", 2779 | "\n", 2780 | "函数是一个可重复使用的代码块,你可以反复调用它进行计算、输出数据,或者任何你想要做的事情.这就是为什么我们要使用编程语言的原因之一.你可以通过自定义函数,将它添加到Python的内置函数中!" 2781 | ] 2782 | }, 2783 | { 2784 | "cell_type": "code", 2785 | "execution_count": null, 2786 | "metadata": { 2787 | "collapsed": true 2788 | }, 2789 | "outputs": [], 2790 | "source": [ 2791 | "def hello_world():\n", 2792 | " \"\"\" Prints Hello, world! \"\"\"\n", 2793 | " print('Hello, world!')\n", 2794 | "\n", 2795 | "hello_world()" 2796 | ] 2797 | }, 2798 | { 2799 | "cell_type": "code", 2800 | "execution_count": null, 2801 | "metadata": { 2802 | "collapsed": true 2803 | }, 2804 | "outputs": [], 2805 | "source": [ 2806 | "for i in range(5):\n", 2807 | " hello_world()" 2808 | ] 2809 | }, 2810 | { 2811 | "cell_type": "markdown", 2812 | "metadata": {}, 2813 | "source": [ 2814 | "Functions are defined with `def`, a function name, a list of parameters, and a colon. Everything indented below the colon will be included in the definition of the function.\n", 2815 | "\n", 2816 | "We can have our functions do anything that you can do with a normal block of code. For example, our `hello_world()` function prints a string every time it is called. If we want to keep a value that a function calculates, we can define the function so that it will `return` the value we want. This is a very important feature of functions, as any variable defined purely within a function will not exist outside of it." 2817 | ] 2818 | }, 2819 | { 2820 | "cell_type": "markdown", 2821 | "metadata": {}, 2822 | "source": [ 2823 | "函数的定义由关键字`def`, 函数名, 一串参数和一个冒号组成.任何在冒号后面缩进的代码都属于函数的主体.\n", 2824 | "\n", 2825 | "我们可以在函数主体代码块中做任何事情.比如,我们的`hello_world()` 函数,每被调用一次,就打印一次字符串.当我们想保留函数中计算的数值,我们可以定义函数,并`return`我们想要传出的任何值.这是函数中一个非常重要的特性,任何在函数中定义的变量都不会存在于函数之外." 
2826 | ] 2827 | }, 2828 | { 2829 | "cell_type": "code", 2830 | "execution_count": null, 2831 | "metadata": { 2832 | "collapsed": true 2833 | }, 2834 | "outputs": [], 2835 | "source": [ 2836 | "def see_the_scope():\n", 2837 | "    in_function_string = \"I'm stuck in here!\"\n", 2838 | "\n", 2839 | "see_the_scope()\n", 2840 | "print(in_function_string) # This raises a NameError: in_function_string does not exist outside the function." 2841 | ] 2842 | }, 2843 | { 2844 | "cell_type": "markdown", 2845 | "metadata": {}, 2846 | "source": [ 2847 | "The **scope** of a variable is the part of the code where that variable is tied to a particular value. Functions in Python have an enclosed scope, so variables defined within them can only be accessed directly within them. If we pass those values to a `return` statement, we can get them out of the function: the function call returns the values, and you can store them in variables that have a greater scope.\n", 2848 | "\n", 2849 | "In this case specifically, including a return statement allows us to keep the string value that we define in the function." 2850 | ] 2851 | }, 2852 | { 2853 | "cell_type": "markdown", 2854 | "metadata": {}, 2855 | "source": [ 2856 | "在一个代码块中,赋给变量的值只能在该代码块中生效,这个也叫做变量的**作用域(scope)**.在Python中,函数拥有封闭的作用域,这样使得其中定义的变量只能在函数内部被直接访问,当我们把这些变量值传递给一个`return`语句,我们就能把这些变量的值从函数中取出来了.这使得函数调用返回值,以便将它们存储在具有较大作用域的变量中.(比如将函数的返回值赋给函数外的变量,函数外的变量就是有较大作用域的变量)\n", 2857 | "\n", 2858 | "在接下来的例子中,特别的有一个返回语句,能够允许我们保留函数中定义的字符串的值."
2859 | ] 2860 | }, 2861 | { 2862 | "cell_type": "code", 2863 | "execution_count": null, 2864 | "metadata": { 2865 | "collapsed": true 2866 | }, 2867 | "outputs": [], 2868 | "source": [ 2869 | "def free_the_scope():\n", 2870 | " in_function_string = \"Anything you can do I can do better!\"\n", 2871 | " return in_function_string\n", 2872 | "my_string = free_the_scope()\n", 2873 | "print(my_string)" 2874 | ] 2875 | }, 2876 | { 2877 | "cell_type": "markdown", 2878 | "metadata": {}, 2879 | "source": [ 2880 | "Just as we can get values out of a function, we can also put values into a function. We do this by defining our function with parameters." 2881 | ] 2882 | }, 2883 | { 2884 | "cell_type": "markdown", 2885 | "metadata": { 2886 | "collapsed": true 2887 | }, 2888 | "source": [ 2889 | "我们既然能从函数取出值,当然也能够给函数喂值.这种操作被称为定义函数的参数." 2890 | ] 2891 | }, 2892 | { 2893 | "cell_type": "code", 2894 | "execution_count": null, 2895 | "metadata": { 2896 | "collapsed": true 2897 | }, 2898 | "outputs": [], 2899 | "source": [ 2900 | "def multiply_by_five(x):\n", 2901 | " \"\"\" Multiplies an input number by 5 \"\"\"\n", 2902 | " return x * 5\n", 2903 | "\n", 2904 | "n = 4\n", 2905 | "print(n)\n", 2906 | "print(multiply_by_five(n))" 2907 | ] 2908 | }, 2909 | { 2910 | "cell_type": "markdown", 2911 | "metadata": {}, 2912 | "source": [ 2913 | "In this example we only had one parameter for our function, `x`. We can easily add more parameters, separating everything with a comma." 2914 | ] 2915 | }, 2916 | { 2917 | "cell_type": "markdown", 2918 | "metadata": {}, 2919 | "source": [ 2920 | "在这个示例中,我们的函数有一个参数`x`.我们通过`,`添加更多的参数." 
2921 | ] 2922 | }, 2923 | { 2924 | "cell_type": "code", 2925 | "execution_count": null, 2926 | "metadata": { 2927 | "collapsed": true 2928 | }, 2929 | "outputs": [], 2930 | "source": [ 2931 | "def calculate_area(length, width):\n", 2932 | " \"\"\" Calculates the area of a rectangle \"\"\"\n", 2933 | " return length * width" 2934 | ] 2935 | }, 2936 | { 2937 | "cell_type": "code", 2938 | "execution_count": null, 2939 | "metadata": { 2940 | "collapsed": true 2941 | }, 2942 | "outputs": [], 2943 | "source": [ 2944 | "l = 5\n", 2945 | "w = 10\n", 2946 | "print('Area: ', calculate_area(l, w))\n", 2947 | "print('Length: ', l)\n", 2948 | "print('Width: ', w)" 2949 | ] 2950 | }, 2951 | { 2952 | "cell_type": "code", 2953 | "execution_count": null, 2954 | "metadata": { 2955 | "collapsed": true 2956 | }, 2957 | "outputs": [], 2958 | "source": [ 2959 | "def calculate_volume(length, width, depth):\n", 2960 | " \"\"\" Calculates the volume of a rectangular prism \"\"\"\n", 2961 | " return length * width * depth" 2962 | ] 2963 | }, 2964 | { 2965 | "cell_type": "markdown", 2966 | "metadata": {}, 2967 | "source": [ 2968 | "If we want to, we can define a function so that it takes an arbitrary number of parameters. We tell Python that we want this by using an asterisk (`*`)." 2969 | ] 2970 | }, 2971 | { 2972 | "cell_type": "markdown", 2973 | "metadata": {}, 2974 | "source": [ 2975 | "如果我们想定义一个可以接受任意数量的参数的函数,可以使用`*`." 
2976 | ] 2977 | }, 2978 | { 2979 | "cell_type": "code", 2980 | "execution_count": null, 2981 | "metadata": { 2982 | "collapsed": true 2983 | }, 2984 | "outputs": [], 2985 | "source": [ 2986 | "def sum_values(*args):\n", 2987 | " sum_val = 0\n", 2988 | " for i in args:\n", 2989 | " sum_val += i\n", 2990 | " return sum_val" 2991 | ] 2992 | }, 2993 | { 2994 | "cell_type": "code", 2995 | "execution_count": null, 2996 | "metadata": { 2997 | "collapsed": true 2998 | }, 2999 | "outputs": [], 3000 | "source": [ 3001 | "print(sum_values(1, 2, 3))\n", 3002 | "print(sum_values(10, 20, 30, 40, 50))\n", 3003 | "print(sum_values(4, 2, 5, 1, 10, 249, 25, 24, 13, 6, 4))" 3004 | ] 3005 | }, 3006 | { 3007 | "cell_type": "markdown", 3008 | "metadata": {}, 3009 | "source": [ 3010 | "The time to use `*args` as a parameter for your function is when you do not know how many values may be passed to it, as in the case of our sum function. The asterisk in this case is the syntax that tells Python that you are going to pass an arbitrary number of parameters into your function. These parameters are stored in the form of a tuple." 3011 | ] 3012 | }, 3013 | { 3014 | "cell_type": "markdown", 3015 | "metadata": {}, 3016 | "source": [ 3017 | "当你不知道有多少参数会被传入到求和函数时,你就可以使用`*args`作为函数的参数.这个例子中的星号是告诉Python你要把任意数量的参数传递给函数的语法.这些参数以元组的形式存储." 3018 | ] 3019 | }, 3020 | { 3021 | "cell_type": "code", 3022 | "execution_count": null, 3023 | "metadata": { 3024 | "collapsed": true 3025 | }, 3026 | "outputs": [], 3027 | "source": [ 3028 | "def test_args(*args):\n", 3029 | " print(type(args))\n", 3030 | "\n", 3031 | "test_args(1, 2, 3, 4, 5, 6)" 3032 | ] 3033 | }, 3034 | { 3035 | "cell_type": "markdown", 3036 | "metadata": {}, 3037 | "source": [ 3038 | "We can put as many elements into the `args` tuple as we want to when we call the function. However, because `args` is a tuple, we cannot modify it after it has been created.\n", 3039 | "\n", 3040 | "The `args` name of the variable is purely by convention. 
You could just as easily name your parameter `*vars` or `*things`. You can treat the `args` tuple like you would any other tuple, easily accessing its values and iterating over it, as in the above `sum_values(*args)` function." 3041 | ] 3042 | }, 3043 | { 3044 | "cell_type": "markdown", 3045 | "metadata": {}, 3046 | "source": [ 3047 | "当我们调用函数时,只要我们想,可以把任意多的元素放入到`args`元组中.然而,因为`args`是一个元组,当它创建之后,是不能修改的.\n", 3048 | "\n", 3049 | "`args`纯粹是约定俗成的名字,你可以叫任何名字`*vars` 或者 `*things`.你可以把`args`就当做普通的元组,很简单的进行访问`args`的值或者对它进行遍历,也可以像上述`sum_values(*args)`函数那样进行操作." 3050 | ] 3051 | }, 3052 | { 3053 | "cell_type": "markdown", 3054 | "metadata": {}, 3055 | "source": [ 3056 | "Our functions can return any data type. This makes it easy for us to create functions that check for conditions that we might want to monitor.\n", 3057 | "\n", 3058 | "Here we define a function that returns a boolean value. We can easily use this in conjunction with if-statements and other situations that require a boolean." 3059 | ] 3060 | }, 3061 | { 3062 | "cell_type": "markdown", 3063 | "metadata": {}, 3064 | "source": [ 3065 | "函数能返回任何数据类型.这使我们能够轻松地创建我们想要用于条件判断的函数." 3066 | ] 3067 | }, 3068 | { 3069 | "cell_type": "code", 3070 | "execution_count": null, 3071 | "metadata": { 3072 | "collapsed": true 3073 | }, 3074 | "outputs": [], 3075 | "source": [ 3076 | "def has_a_vowel(word):\n", 3077 | " \"\"\" \n", 3078 | " Checks to see whether a word contains one of the\n", 3079 | " conventional vowels: 'a', 'e', 'i', 'o', or 'u'.\n", 3080 | " It does not treat 'y' or 'w' as vowels, and the\n", 3081 | " check is case-sensitive, so uppercase vowels are\n", 3082 | " not detected.\n", 3083 | " \"\"\"\n", 3084 | " vowel_list = ['a', 'e', 'i', 'o', 'u']\n", 3085 | " \n", 3086 | " for vowel in vowel_list:\n", 3087 | " if vowel in word:\n", 3088 | " return True\n", 3089 | " # If there is a vowel in the word, the function has already returned, so this line only runs when no vowel was found\n", 3090 | " return False" 3091 | ] 3092 | }, 3093 | { 3094 | "cell_type": "code", 3095 | "execution_count": null, 3096 | "metadata": { 3097 | "collapsed": true 3098 | }, 3099 | "outputs": [], 3100 | "source": [ 3101 | "my_word = 'catnapping'\n", 3102 | "if has_a_vowel(my_word):\n", 3103 | " print('How surprising, an English word contains a vowel.')\n", 3104 | "else:\n", 3105 | " print('This is actually surprising.')" 3106 | ] 3107 | }, 3108 | { 3109 | "cell_type": "code", 3110 | "execution_count": null, 3111 | "metadata": { 3112 | "collapsed": true 3113 | }, 3114 | "outputs": [], 3115 | "source": [ 3116 | "def point_maker(x, y):\n", 3117 | " \"\"\" Groups x and y values into a point, technically a tuple \"\"\"\n", 3118 | " return x, y" 3119 | ] 3120 | }, 3121 | { 3122 | "cell_type": "markdown", 3123 | "metadata": {}, 3124 | "source": [ 3125 | "The above function returns an ordered pair of the input parameters, stored as a tuple." 3126 | ] 3127 | }, 3128 | { 3129 | "cell_type": "markdown", 3130 | "metadata": {}, 3131 | "source": [ 3132 | "上面的函数返回一个输入参数的有序对,并作为元组存储."
3133 | ] 3134 | }, 3135 | { 3136 | "cell_type": "code", 3137 | "execution_count": null, 3138 | "metadata": { 3139 | "collapsed": true, 3140 | "scrolled": true 3141 | }, 3142 | "outputs": [], 3143 | "source": [ 3144 | "a = point_maker(0, 10)\n", 3145 | "b = point_maker(5, 3)\n", 3146 | "def calculate_slope(point_a, point_b):\n", 3147 | " \"\"\" Calculates the linear slope between two points \"\"\"\n", 3148 | " return (point_b[1] - point_a[1])/(point_b[0] - point_a[0])\n", 3149 | "print(\"The slope between a and b is {0}\".format(calculate_slope(a, b)))" 3150 | ] 3151 | }, 3152 | { 3153 | "cell_type": "markdown", 3154 | "metadata": {}, 3155 | "source": [ 3156 | "The `calculate_slope()` function above calculates the slope between two points." 3157 | ] 3158 | }, 3159 | { 3160 | "cell_type": "markdown", 3161 | "metadata": {}, 3162 | "source": [ 3163 | "可以用来计算两点的斜率" 3164 | ] 3165 | }, 3166 | { 3167 | "cell_type": "code", 3168 | "execution_count": null, 3169 | "metadata": { 3170 | "collapsed": true 3171 | }, 3172 | "outputs": [], 3173 | "source": [ 3174 | "print(\"The point-slope form of the line between a and b, using point a, is: y - {0} = {2}(x - {1})\".format(a[1], a[0], calculate_slope(a, b)))" 3175 | ] 3176 | }, 3177 | { 3178 | "cell_type": "markdown", 3179 | "metadata": {}, 3180 | "source": [ 3181 | "With the proper syntax, you can define functions to do whatever calculations you want. This makes them an indispensable part of programming in any language." 3182 | ] 3183 | }, 3184 | { 3185 | "cell_type": "markdown", 3186 | "metadata": {}, 3187 | "source": [ 3188 | "有了正确的语法,就可以定义函数来完成所需的任何计算.这使得它们成为任何编程语言中不可缺少的一部分" 3189 | ] 3190 | }, 3191 | { 3192 | "cell_type": "markdown", 3193 | "metadata": {}, 3194 | "source": [ 3195 | "## Next Steps\n", 3196 | "\n", 3197 | "This was a lot of material and there is still even more to cover! Make sure you play around with the cells in each notebook to accustom yourself to the syntax featured here and to figure out any limitations. 
If you want to delve even deeper into the material, the [documentation for Python](https://docs.python.org/3/) is available online. We are in the process of developing a second part to this Python tutorial, designed to provide you with even more programming knowledge, so keep an eye on the [Quantopian Lectures Page](https://www.quantopian.com/lectures) and the [forums](https://www.quantopian.com/posts) for any new lectures." 3198 | ] 3199 | }, 3200 | { 3201 | "cell_type": "markdown", 3202 | "metadata": {}, 3203 | "source": [ 3204 | "## 下一步\n", 3205 | "\n", 3206 | "这里已经有很多材料,但是还有更多的内容需要去了解!请确保你能玩转notebook中每一个单元格中的内容,熟悉里面的语法,找出任何限制.如果你想深入的研究python,可以访问 [Python 3.6.3官方文档](https://docs.python.org/3.6/), [中文文档点我](http://python.usyiyi.cn/translate/python_352/index.html)\n", 3207 | "\n", 3208 | "我们正在开发Python教程的第二部分,旨在为您提供更多的编程知识,请关注 [Quantopian Lectures Page](https://www.quantopian.com/lectures)和[论坛](https://www.quantopian.com/posts)" 3209 | ] 3210 | } 3211 | ], 3212 | "metadata": { 3213 | "kernelspec": { 3214 | "display_name": "Python 3", 3215 | "language": "python", 3216 | "name": "python3" 3217 | }, 3218 | "language_info": { 3219 | "codemirror_mode": { 3220 | "name": "ipython", 3221 | "version": 3 3222 | }, 3223 | "file_extension": ".py", 3224 | "mimetype": "text/x-python", 3225 | "name": "python", 3226 | "nbconvert_exporter": "python", 3227 | "pygments_lexer": "ipython3", 3228 | "version": "3.5.4" 3229 | } 3230 | }, 3231 | "nbformat": 4, 3232 | "nbformat_minor": 1 3233 | } 3234 | -------------------------------------------------------------------------------- /03.Introduction to NumPy.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to NumPy\n", 8 | "by Maxwell Margenot\n", 9 | "\n", 10 | "Part of the Quantopian Lecture Series:\n", 11 | "\n", 12 | "* [www.quantopian.com/lectures](https://www.quantopian.com/lectures)\n", 13 | "* 
[github.com/quantopian/research_public](https://github.com/quantopian/research_public)\n", 14 | "\n", 15 | "Notebook released under the Creative Commons Attribution 4.0 License." 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "NumPy is an incredibly powerful package in Python that is ubiquitous throughout the Quantopian platform. It has strong integration with Pandas, another tool we will be covering in the lecture series. NumPy adds support for multi-dimensional arrays and mathematical functions that allow you to easily perform linear algebra calculations. This lecture will be a collection of linear algebra examples computed using NumPy." 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 30 | "NumPy是一个非常强大的Python包,它几乎无处不在(不单单在Quantopian平台中)。它和Pandas是好朋友,我们将在另一门课中专门介绍Pandas. NumPy增加了对多维数组的支持,自带很多用于数学计算的函数,能让你轻松地进行线性代数的计算。\n", 34 | "本课将用Numpy为你展示一些它的应用。" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": null, 40 | "metadata": { 41 | "collapsed": true 42 | }, 43 | "outputs": [], 44 | "source": [ 45 | "import numpy as np\n", 46 | "import matplotlib.pyplot as plt\n", 47 | "\n", 48 | "%matplotlib inline" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "### Basic NumPy arrays" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "metadata": {}, 61 | "source": [ 62 | "The most basic way that we could make use of NumPy in finance is calculating the mean return of a portfolio. Say that we have a list containing the historical returns of several stocks."
63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "### Numpy数组基础" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 77 | "Numpy在金融中应用,最基础的应该是计算一个投资组合平均回报。\n", 78 | "我们用若干只股票的历史回报构成一个列表。" 83 | ] 84 | }, 85 | { 86 | "cell_type": "code", 87 | "execution_count": null, 88 | "metadata": { 89 | "collapsed": true 90 | }, 91 | "outputs": [], 92 | "source": [ 93 | "stock_list = [3.5, 5, 2, 8, 4.2]" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": {}, 99 | "source": [ 100 | "We can convert the list into a NumPy array by passing it to `np.array()`:" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": [ 108 | "通过`numpy.array()`使得上述的list转换成`NumPy`中的数据结构:`numpy.ndarray`" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": null, 118 | "metadata": { 119 | "collapsed": true 120 | }, 124 | "outputs": [], 125 | "source": [ 126 | "returns = np.array(stock_list)\n", 127 | "print(returns, type(returns))" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "You'll notice that the type of our array is 'ndarray', not just 'array'. This is because NumPy arrays can be created with multiple dimensions. If we pass `np.array()` a list of lists, it will create a 2-dimensional array. If we pass a list of lists of lists, it will create a 3-dimensional array, and so on and so forth."
135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "metadata": {}, 140 | "source": [ 142 | "你可能注意到我们的数组类型是`ndarray`,而不是`array`.这是因为`NumPy`数组可以是多维的.如果我们给`np.array()`传递一个列表的列表,它将生成一个二维数组.如果传递的是一个列表的列表的列表,它将会生成一个三维数组,以此类推。" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": null, 152 | "metadata": { 153 | "collapsed": true 154 | }, 158 | "outputs": [], 159 | "source": [ 160 | "A = np.array([[1, 2], [3, 4]])\n", 161 | "print(A, type(A))" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "We can access the dimensions of an array by looking at its `shape` attribute." 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "metadata": {}, 174 | "source": [ 176 | "我们可以通过`.shape`查看数组的维度。" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": null, 186 | "metadata": { 187 | "collapsed": true 188 | }, 192 | "outputs": [], 193 | "source": [ 194 | "print(A.shape)" 195 | ] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "Arrays are indexed in much the same way as lists in Python. Elements of an array are indexed from $0$ to $n - 1$, where $n$ is the length of the array." 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "数组的索引和Python中的列表一样. 一个长度为$n$的列表的索引起始于 $0$ 结束于 $n - 1$."
209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 215 | "metadata": { 216 | "collapsed": true 217 | }, 221 | "outputs": [], 222 | "source": [ 223 | "print(returns[0], returns[len(returns) - 1])" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "We can take a slice of an array using a colon, just like in a list." 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "和list一样,我们用冒号进行切片操作." 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": null, 244 | "metadata": { 245 | "collapsed": true 246 | }, 250 | "outputs": [], 251 | "source": [ 252 | "print(returns[1:3])" 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": [ 259 | "A slice of an array, like in a list, will select a group of elements in the array starting from the first element indicated and going up to (but not including) the last element indicated.\n", 260 | "\n", 261 | "In the case of multidimensional arrays, many of the same conventions with slicing and indexing hold. 
We can access the first column of a 2-dimensional array like so:" 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": {}, 267 | "source": [ 268 | "数组的切片,也和列表一样,从指定的第一个元素开始,按照一定的步长(默认是1)选择元素,直到你指定的最后一个元素,但是不包含最后一个元素.\n", 269 | "在多维数组的情形中,情况也是类似的。接下来的例子中,我们可以这样去访问一个二维数组的第一列:" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": null, 276 | "metadata": { 277 | "collapsed": true 278 | }, 282 | "outputs": [], 283 | "source": [ 284 | "print(A[:, 0])" 285 | ] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": {}, 290 | "source": [ 291 | "And the first row of a 2-dimensional array like so:" 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "访问数组首行:" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": null, 305 | "metadata": { 306 | "collapsed": true 307 | }, 311 | "outputs": [], 312 | "source": [ 313 | "print(A[0, :])" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "metadata": {}, 319 | "source": [ 320 | "Notice that each slice of the array returns yet another array!" 321 | ] 322 | }, 323 | { 324 | "cell_type": "markdown", 325 | "metadata": {}, 326 | "source": [ 327 | "注意每次通过切片返回的是另一个numpy数组!"
328 | ] 329 | }, 330 | { 331 | "cell_type": "code", 332 | "execution_count": null, 334 | "metadata": { 335 | "collapsed": true 336 | }, 340 | "outputs": [], 341 | "source": [ 342 | "print(type(A[0,:]))" 343 | ] 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "metadata": {}, 348 | "source": [ 349 | "Passing only one index to a 2-dimensional array returns the row with that index, which gives us another way to access individual rows." 350 | ] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "如果二维数组只用一个索引,得到的将是某行数据。" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": null, 363 | "metadata": { 364 | "collapsed": true 365 | }, 369 | "outputs": [], 370 | "source": [ 371 | "print(A[0])" 372 | ] 373 | }, 374 | { 375 | "cell_type": "markdown", 376 | "metadata": {}, 377 | "source": [ 378 | "Indexing an individual element with both a row index and a column index returns only that element." 379 | ] 380 | }, 381 | { 382 | "cell_type": "markdown", 383 | "metadata": { 384 | "collapsed": true 385 | }, 386 | "source": [ 388 | "访问单个元素,返回的只是你索引的元素(而不是一个numpy数组)." 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": null, 398 | "metadata": { 399 | "collapsed": true 400 | }, 404 | "outputs": [], 405 | "source": [ 406 | "print(A[1, 1])" 407 | ] 408 | }, 409 | { 410 | "cell_type": "markdown", 411 | "metadata": {}, 412 | "source": [ 413 | "#### Array functions\n", 414 | "\n", 415 | "Functions built into NumPy can be easily called on arrays. Most functions are applied to an array element-wise (as scalar multiplication is). For example, if we call `log()` on an array, the logarithm will be taken of each element." 416 | ] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "metadata": {}, 421 | "source": [ 422 | "#### 数组函数\n", 423 | "numpy内置函数使用起来特别简单,大多数函数都是直接作用于元素层面的,我们来看下np.log()例子,自己感受下吧." 424 | ] 425 | }, 426 | { 427 | "cell_type": "code", 428 | "execution_count": null, 430 | "metadata": { 431 | "collapsed": true 432 | }, 436 | "outputs": [], 437 | "source": [ 438 | "print(np.log(returns))" 439 | ] 440 | }, 441 | { 442 | "cell_type": "markdown", 443 | "metadata": {}, 444 | "source": [ 445 | "Some functions return a single value. This is because they aggregate over the array as a whole (much as you would over a list) rather than operating element by element. For example, the `mean()` function will do exactly what you expect, calculating the mean of an array."
446 | ] 447 | }, 448 | { 449 | "cell_type": "markdown", 450 | "metadata": {}, 451 | "source": [ 453 | "有些函数会返回一个单独的值,他们把array当做一个整体来看待,比如说`np.mean()`函数,它会计算一个数组的平均值。" 457 | ] 458 | }, 459 | { 460 | "cell_type": "code", 461 | "execution_count": null, 463 | "metadata": { 464 | "collapsed": true 465 | }, 469 | "outputs": [], 470 | "source": [ 471 | "print(np.mean(returns))" 472 | ] 473 | }, 474 | { 475 | "cell_type": "markdown", 476 | "metadata": {}, 477 | "source": [ 478 | "Similarly, the `max()` function returns the maximum element of an array." 479 | ] 480 | }, 481 | { 482 | "cell_type": "markdown", 483 | "metadata": {}, 484 | "source": [ 486 | "np.max()会计算数组最大值" 490 | ] 491 | }, 492 | { 493 | "cell_type": "code", 494 | "execution_count": null, 496 | "metadata": { 497 | "collapsed": true 498 | }, 502 | "outputs": [], 503 | "source": [ 504 | "print(np.max(returns))" 505 | ] 506 | }, 507 | { 508 | "cell_type": "markdown", 509 | "metadata": {}, 510 | "source": [ 511 | "For further reading on the universal functions in NumPy, check out the [documentation](https://docs.scipy.org/doc/numpy/user/quickstart.html#universal-functions)." 512 | ] 513 | }, 514 | { 515 | "cell_type": "markdown", 516 | "metadata": {}, 517 | "source": [ 518 | "更多的介绍请参考[文档](https://docs.scipy.org/doc/numpy/user/quickstart.html#universal-functions)."
519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": {}, 524 | "source": [ 525 | "### Return to the returns\n", 526 | "\n", 527 | "Now let's modify our returns array with scalar values. If we add a scalar value to an array it will be added to every element of the array. If we multiply an array by a scalar value it will be multiplied against every element of the array. If we do both, both will happen!" 528 | ] 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": {}, 533 | "source": [ 534 | "### Return to the returns\n", 536 | "现在我们用标量和returns做计算,比如做加法,那么这个标量将会和return的每个元素进行相加。乘法也是一样,我们来试试同时进行乘法和加法会发生什么!" 540 | ] 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": null, 545 | "metadata": { 547 | "collapsed": true, 550 | "scrolled": false 551 | }, 552 | "outputs": [], 553 | "source": [ 554 | "returns*2 + 5" 555 | ] 556 | }, 557 | { 558 | "cell_type": "markdown", 559 | "metadata": {}, 560 | "source": [ 561 | "NumPy also has functions specifically built to operate on arrays. Let's take the mean and standard deviation of this group of returns."
562 | ] 563 | }, 564 | { 565 | "cell_type": "markdown", 566 | "metadata": {}, 567 | "source": [ 569 | "numpy有专门作用于数组的函数,让我们来计算下这组回报的均值和标准差。" 573 | ] 574 | }, 575 | { 576 | "cell_type": "code", 577 | "execution_count": null, 579 | "metadata": { 580 | "collapsed": true 581 | }, 585 | "outputs": [], 586 | "source": [ 587 | "print(\"Mean: \", np.mean(returns), \"Std Dev: \", np.std(returns))" 588 | ] 589 | }, 590 | { 591 | "cell_type": "markdown", 592 | "metadata": {}, 593 | "source": [ 594 | "Let's simulate a universe of stocks using NumPy's functions. First we need to preallocate the arrays that will hold the assets and returns we use to build a portfolio, because NumPy arrays are created with a fixed size. The number of elements can't be changed without creating a new array." 595 | ] 596 | }, 597 | { 598 | "cell_type": "markdown", 599 | "metadata": {}, 600 | "source": [ 602 | "我们用numpy的函数来虚拟一个股票池。首先我们要创建一个数组,用来存储资产和回报的数据,然后用这个去创建一个投资组合。NumPy数组的尺寸是不能改变的(可以改变形状)。" 606 | ] 607 | }, 608 | { 609 | "cell_type": "code", 610 | "execution_count": null, 611 | "metadata": { 612 | "collapsed": true 613 | }, 614 | "outputs": [], 615 | "source": [ 616 | "N = 10\n", 617 | "assets = np.zeros((N, 100))\n", 618 | "returns = np.zeros((N, 100))" 619 | ] 620 | }, 621 | { 622 | "cell_type": "markdown", 623 | "metadata": {}, 624 | "source": [ 625 | "This function, `zeros()`, creates a NumPy array with the given dimensions that is entirely filled in with $0$. We can pass a single value or a tuple of as many dimensions as we like. 
Passing in the tuple `(N, 100)` will return a two-dimensional array with $N$ rows and $100$ columns. Our result is an $N \\times 100$ array.\n", 626 | "\n", 627 | "Now we will simulate a base asset. We want the universe of stocks to be correlated with each other, so we will use this base asset to generate the others." 628 | ] 629 | }, 630 | { 631 | "cell_type": "markdown", 632 | "metadata": {}, 633 | "source": [ 635 | "np.zeros顾名思义,给定维度,创建一个全是0的矩阵,函数的参数是一个标量或者一个tuple(元组),比如说(N,100),那么生成的就是10(N=10)行100列的二维数组。\n", 636 | "\n", 637 | "我们首先随机生成一个基准资产值,股票池中的股票都以这个基准值设定他们的初始净值,这样做能让股票之间保持**相关性**。" 642 | ] 643 | }, 644 | { 645 | "cell_type": "code", 646 | "execution_count": null, 647 | "metadata": { 648 | "collapsed": true 649 | }, 650 | "outputs": [], 651 | "source": [ 652 | "np.random.seed(1)\n", 653 | "R_1 = np.random.normal(1.01, 0.03, 100)\n", 654 | "returns[0] = R_1\n", 655 | "assets[0] = np.cumprod(R_1)" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": null, 661 | "metadata": { 662 | "collapsed": true 663 | }, 664 | "outputs": [], 665 | "source": [ 666 | "returns[0]" 667 | ] 668 | }, 669 | { 670 | "cell_type": "code", 671 | "execution_count": null, 672 | "metadata": { 673 | "collapsed": true 674 | }, 675 | "outputs": [], 676 | "source": [ 677 | "np.cumprod?"
678 | ] 679 | }, 680 | { 681 | "cell_type": "code", 682 | "execution_count": null, 684 | "metadata": { 685 | "collapsed": true 686 | }, 690 | "outputs": [], 691 | "source": [ 692 | "a = np.array([[1, 2, 3], [4, 5, 6]])\n", 693 | "a.shape" 694 | ] 695 | }, 696 | { 697 | "cell_type": "code", 698 | "execution_count": null, 700 | "metadata": { 701 | "collapsed": true 702 | }, 706 | "outputs": [], 707 | "source": [ 708 | "np.cumprod(a)" 709 | ] 710 | }, 711 | { 712 | "cell_type": "markdown", 713 | "metadata": {}, 714 | "source": [ 715 | "The `random` module in NumPy is exceedingly useful. It contains methods for sampling from many different probability distributions, some of which are covered in the [random variables lecture](https://www.quantopian.com/lectures/random-variables) in the Quantopian lecture series. In this case we draw $100$ random samples from a normal distribution with mean $1.01$ and standard deviation $0.03$. We treat these as the daily return multipliers of our asset (a value of $1.01$ corresponds to a 1% gain) and take the cumulative product of these samples to get the price path, relative to a starting price of $1$.\n", 716 | "\n", 717 | "The way we have generated our universe, the individual $R_i$ vectors are each 1-dimensional arrays and the `returns` and `assets` variables contain 2-dimensional arrays. Above, we set the initial row of both `returns` and `assets` to be the first $R_i$ vector and the cumulative asset price based on those returns, respectively.\n", 718 | "\n", 719 | "We will now use this base asset to create a few other random assets that are correlated with it."
720 | ] 721 | }, 722 | { 723 | "cell_type": "markdown", 724 | "metadata": {}, 725 | "source": [ 726 | <<<<<<< HEAD 727 | "`random`是非常有用的模块,能对不同概率分布的进行随机取样,更详细的内容会在后面的课程:[随机变量](https://www.quantopian.com/lectures/random-variables)中讲到.上面的例子,我们用期望为$1.01$,标准差为$0.03$的正态分布随机生成了100个样本,把这些值当做每只股票的日回报率(即100天每天的回报率),通过np.cumprod做累乘,算出每天的资产价格。\n", 728 | "\n", 729 | "至此,我们已经生成了我们的universe(股票池),$R_i$向量是一个一维数组,`returns` 和 `assets`是二维的\n", 730 | "数组。我们用$R_i$向量去初始化`returns` 和 `assets`每行的数据,最后得到初始的`returns` 和 `assets`。" 731 | ======= 732 | "`random`是非常有用的模块,能对不同概率分布的进行随机取样,更详细的知识可以在后面的课程:[随机变量](https://www.quantopian.com/lectures/random-variables)会讲到.上面的例子,我们用期望为$1.01$,标准差为$0.03$的正态分布随机生成了100个样本,把这些值当做每只股票的日收益率,通过np.cumprod做累乘,算出当前资产价格。\n", 733 | "\n", 734 | "至此,我们已经生成了我们的universe(证券池),$R_i$向量是一个一维的数组,`returns` 和 `assets`变量是二维的数组。我们用$R_i$向量去初始化`returns` 和 `assets`每行的数据,最后得到初始的`returns` 和 `assets`。" 735 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 736 | ] 737 | }, 738 | { 739 | "cell_type": "code", 740 | "execution_count": null, 741 | "metadata": { 742 | "collapsed": true 743 | }, 744 | "outputs": [], 745 | "source": [ 746 | "# Generate assets that are correlated with R_1\n", 747 | "np.random.seed(2)\n", 748 | "for i in range(1, N):\n", 749 | " R_i = R_1 + np.random.normal(0.001, 0.02, 100)\n", 750 | " returns[i] = R_i # Set each row of returns equal to the new R_i array\n", 751 | " assets[i] = np.cumprod(R_i)\n", 752 | " \n", 753 | "mean_returns = [(np.mean(R) - 1)*100 for R in returns]\n", 754 | "return_volatilities = [np.std(R) for R in returns]" 755 | ] 756 | }, 757 | { 758 | "cell_type": "code", 759 | "execution_count": null, 760 | <<<<<<< HEAD 761 | "metadata": { 762 | "collapsed": true 763 | }, 764 | ======= 765 | "metadata": {}, 766 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 767 | "outputs": [], 768 | "source": [ 769 | "len(mean_returns)" 770 | ] 771 | }, 772 | { 773 | "cell_type": "markdown", 774 | "metadata": {}, 775 | "source": [ 776 | "Here we generate the 
remaining $N - 1$ securities that we want in our universe by adding random noise to $R_1$. This ensures that our $N - 1$ other assets will be correlated with the base asset because they have some underlying information that is shared.\n", 777 | "\n", 778 | "Let's plot what the mean return of each asset looks like:" 779 | ] 780 | }, 781 | { 782 | "cell_type": "markdown", 783 | "metadata": {}, 784 | "source": [ 785 | <<<<<<< HEAD 786 | "这里,我们基于$R_1$,加上用正态分布随机生成的噪声,得到股票池中另外$N - 1$只股票的回报,这样能确保他们拥有相关性。\n", 787 | "我们用画图工具来看下每只股票回报的均值。" 788 | ======= 789 | "这里,我们基于$R_1$,加上用正态分布随机生成的噪声,得到股票池总另外$N - 1$只证券的收益,这样能确保他们拥有相关性。\n", 790 | "我们用画图工具来看下每只证券收益均值。" 791 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 792 | ] 793 | }, 794 | { 795 | "cell_type": "code", 796 | "execution_count": null, 797 | "metadata": { 798 | <<<<<<< HEAD 799 | "collapsed": true, 800 | ======= 801 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 802 | "scrolled": false 803 | }, 804 | "outputs": [], 805 | "source": [ 806 | "plt.bar(np.arange(len(mean_returns)), mean_returns)\n", 807 | "plt.xlabel('Stock')\n", 808 | "plt.ylabel('Returns')\n", 809 | "plt.title('Returns for {0} Random Assets'.format(N));" 810 | ] 811 | }, 812 | { 813 | "cell_type": "markdown", 814 | "metadata": {}, 815 | "source": [ 816 | "### Calculating Expected Return\n", 817 | "\n", 818 | "So we have a universe of stocks. Great! Now let's put them together in a portfolio and calculate its expected return and risk.\n", 819 | "\n", 820 | "We will start off by generating $N$ random weights for each asset in our portfolio." 
821 | ] 822 | }, 823 | { 824 | "cell_type": "markdown", 825 | "metadata": {}, 826 | "source": [ 827 | <<<<<<< HEAD 828 | "### 计算回报期望\n", 829 | "\n", 830 | "我们现在已经有了一个股票池(universe).非常好!我们用他们生成一个投资组合,然后计算他的预期回报率和风险值。\n", 831 | ======= 832 | "### 计算收益期望\n", 833 | "\n", 834 | "我们现在已经有了一个股票池(universe).非常好!我们用他们生成一个投资组合,然后计算他的预期收益率和风险值。\n", 835 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 836 | "\n", 837 | "我们首先给我们组合中每只股票设定一个初始的随机权重值。" 838 | ] 839 | }, 840 | { 841 | "cell_type": "code", 842 | "execution_count": null, 843 | "metadata": { 844 | "collapsed": true 845 | }, 846 | "outputs": [], 847 | "source": [ 848 | "weights = np.random.uniform(0, 1, N)\n", 849 | "weights = weights/np.sum(weights)" 850 | ] 851 | }, 852 | { 853 | "cell_type": "code", 854 | "execution_count": null, 855 | <<<<<<< HEAD 856 | "metadata": { 857 | "collapsed": true 858 | }, 859 | ======= 860 | "metadata": {}, 861 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 862 | "outputs": [], 863 | "source": [ 864 | "weights.shape" 865 | ] 866 | }, 867 | { 868 | "cell_type": "markdown", 869 | "metadata": {}, 870 | "source": [ 871 | "We have to rescale the weights so that they all add up to $1$. We do this by scaling the weights vector by the sum total of all the weights. This step ensures that we will be using $100\\%$ of the portfolio's cash.\n", 872 | "\n", 873 | "To calculate the mean return of the portfolio, we have to scale each asset's return by its designated weight. We can pull each element of each array and multiply them individually, but it's quicker to use NumPy's linear algebra methods. The function that we want is `dot()`. This will calculate the dot product between two arrays for us. So if $v = \\left[ 1, 2, 3 \\right]$ and $w = \\left[4, 5, 6 \\right]$, then:\n", 874 | "\n", 875 | "$$ v \\cdot w = 1 \\times 4 + 2 \\times 5 + 3 \\times 6 $$\n", 876 | "\n", 877 | "For a one-dimensional vector, the dot product will multiply each element pointwise and add all the products together! 
In our case, we have a vector of weights, $\\omega = \\left[ \\omega_1, \\omega_2, \\dots \\omega_N\\right]$ and a vector of returns, $\\mu = \\left[ \\mu_1, \\mu_2, \\dots, \\mu_N\\right]$. If we take the dot product of these two we will get:\n", 878 | "\n", 879 | "$$ \\omega \\cdot \\mu = \\omega_1\\mu_1 + \\omega_2\\mu_2 + \\dots + \\omega_N\\mu_N = \\mu_P $$\n", 880 | "\n", 881 | "This yields the sum of all the asset returns scaled by their respective weights. This is the portfolio's overall expected return!" 882 | ] 883 | }, 884 | { 885 | "cell_type": "code", 886 | "execution_count": null, 887 | <<<<<<< HEAD 888 | "metadata": { 889 | "collapsed": true 890 | }, 891 | ======= 892 | "metadata": {}, 893 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 894 | "outputs": [], 895 | "source": [ 896 | "p_returns = np.dot(weights, mean_returns)\n", 897 | "print(\"Expected return of the portfolio: \", p_returns)" 898 | ] 899 | }, 900 | { 901 | "cell_type": "markdown", 902 | "metadata": {}, 903 | "source": [ 904 | <<<<<<< HEAD 905 | "这里p_returns就是资产组合平均回报。简单来说,假设组合里都是股票,每个股票都有一个权重值,当然这个不是随机的,为了方便计算,我们取得随机值,并做缩放处理(每项权重值/权重值之和),使得各项权重之和等于1.我们用股票的平均回报乘以股票对应的权重值,然后求和,就能得到这个资产组合的平均回报。\n", 906 | ======= 907 | "这里p_returns就是我们常说的资产组合净值。简单来说,假设组合里都是股票,每个股票都有一个权重值,当然这个不是随机的,为了方便计算,我们取得随机值,并做归一化处理(权重值/权重值之和),使得权重之和等于1.我们用权重去乘以每只股票的平均收益率,然后求和,就能得到这个资产组合的净值。\n", 908 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 909 | "\n", 910 | "我们这里使用的是`np.dot`,对于一维数组(向量),将数组中对应位置的元素进行相乘,并将结果相加。\n", 911 | "\n", 912 | "$$ v = \\left[ 1, 2, 3 \\right], \\ w = \\left[4, 5, 6 \\right] $$\n", 913 | "\n", 914 | "$$ v \\cdot w = 1 \\times 4 + 2 \\times 5 + 3 \\times 6 $$\n", 915 | "\n", 916 | <<<<<<< HEAD 917 | "在我们的例子中,权重: $\\omega = \\left[ \\omega_1, \\omega_2, \\dots \\omega_N\\right]$ 和回报: $\\mu = \\left[ \\mu_1, \\mu_2, \\dots, \\mu_N\\right]$.\n", 918 | ======= 919 | "在我们的例子中,权重: $\\omega = \\left[ \\omega_1, \\omega_2, \\dots \\omega_N\\right]$ 和收益: $\\mu = \\left[ \\mu_1, \\mu_2, \\dots, 
\\mu_N\\right]$.\n", 920 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 921 | "\n", 922 | "$$ \\omega \\cdot \\mu = \\omega_1\\mu_1 + \\omega_2\\mu_2 + \\dots + \\omega_N\\mu_N = \\mu_P $$" 923 | ] 924 | }, 925 | { 926 | "cell_type": "code", 927 | "execution_count": null, 928 | "metadata": { 929 | "collapsed": true 930 | }, 931 | "outputs": [], 932 | "source": [ 933 | "p_variance = np.power((np.multiply(weights, mean_returns) - p_returns), 2)\n", 934 | "p_std = np.sum(np.sqrt(p_variance))" 935 | ] 936 | }, 937 | { 938 | "cell_type": "code", 939 | "execution_count": null, 940 | <<<<<<< HEAD 941 | "metadata": { 942 | "collapsed": true 943 | }, 944 | ======= 945 | "metadata": {}, 946 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 947 | "outputs": [], 948 | "source": [ 949 | "np.sum(p_variance)" 950 | ] 951 | }, 952 | { 953 | "cell_type": "code", 954 | "execution_count": null, 955 | <<<<<<< HEAD 956 | "metadata": { 957 | "collapsed": true 958 | }, 959 | ======= 960 | "metadata": {}, 961 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 962 | "outputs": [], 963 | "source": [ 964 | "p_std" 965 | ] 966 | }, 967 | { 968 | "cell_type": "markdown", 969 | "metadata": {}, 970 | "source": [ 971 | "Calculating the mean return is fairly intuitive and does not require too much explanation of linear algebra. However, calculating the variance of our portfolio requires a bit more background." 972 | ] 973 | }, 974 | { 975 | "cell_type": "markdown", 976 | "metadata": {}, 977 | "source": [ 978 | <<<<<<< HEAD 979 | "计算平均回报很直观,然而,计算投资组合的回报方差,这个时候就需要线性代数来帮忙了。" 980 | ======= 981 | "计算收益的均值很直观,并不需要线性代数的相关知识,然而,计算我们投资组合的方差就需要一些背景知识了。" 982 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 983 | ] 984 | }, 985 | { 986 | "cell_type": "markdown", 987 | "metadata": {}, 988 | "source": [ 989 | "#### Beware of NaN values\n", 990 | "\n", 991 | "Most of the time, all of these calculations will work without an issue. 
However, when working with real data we run the risk of having `nan` values in our arrays. This is NumPy's way of saying that the data there is missing or doesn't exist. These `nan` values can lead to errors in mathematical calculations so it is important to be aware of whether your array contains `nan` values and to know how to drop them." 992 | ] 993 | }, 994 | { 995 | "cell_type": "markdown", 996 | "metadata": {}, 997 | "source": [ 998 | "#### 对NaN值要小心\n", 999 | "\n", 1000 | <<<<<<< HEAD 1001 | "多数候进行数值计算并不会有问题。但当NumPy遇到了缺省值时,就会用`nan`来表示,当做数值计算的时候,这个会带来麻烦,所以在进行计算的时候,需要消除他们带来的影响。" 1002 | ======= 1003 | "大多数时候计算并不会有问题。但当我们的numpy遇到了缺省值时,就会用`nan`来表示,当我们做数值计算的时候,这个会带来麻烦,所以在进行计算的时候,消除他们带来的影响。" 1004 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1005 | ] 1006 | }, 1007 | { 1008 | "cell_type": "code", 1009 | "execution_count": null, 1010 | <<<<<<< HEAD 1011 | "metadata": { 1012 | "collapsed": true 1013 | }, 1014 | ======= 1015 | "metadata": {}, 1016 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1017 | "outputs": [], 1018 | "source": [ 1019 | "v = np.array([1, 2, np.nan, 4, 5])\n", 1020 | "print(v)" 1021 | ] 1022 | }, 1023 | { 1024 | "cell_type": "markdown", 1025 | "metadata": {}, 1026 | "source": [ 1027 | "Let's see what happens when we try to take the mean of this array." 1028 | ] 1029 | }, 1030 | { 1031 | "cell_type": "markdown", 1032 | "metadata": {}, 1033 | "source": [ 1034 | "如有`nan`时,我们计算均值时会发生什么?" 1035 | ] 1036 | }, 1037 | { 1038 | "cell_type": "code", 1039 | "execution_count": null, 1040 | <<<<<<< HEAD 1041 | "metadata": { 1042 | "collapsed": true 1043 | }, 1044 | ======= 1045 | "metadata": {}, 1046 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1047 | "outputs": [], 1048 | "source": [ 1049 | "print(np.mean(v))" 1050 | ] 1051 | }, 1052 | { 1053 | "cell_type": "markdown", 1054 | "metadata": {}, 1055 | "source": [ 1056 | "Clearly, `nan` values can have a large impact on our calculations. 
Fortunately, we can check for `nan` values with the `isnan()` function." 1057 | ] 1058 | }, 1059 | { 1060 | "cell_type": "markdown", 1061 | "metadata": {}, 1062 | "source": [ 1063 | "这下,你知道`nan`会对我们计算有多大影响了吧!\n", 1064 | "还好我们有检测函数,np.isnan()" 1065 | ] 1066 | }, 1067 | { 1068 | "cell_type": "code", 1069 | "execution_count": null, 1070 | <<<<<<< HEAD 1071 | "metadata": { 1072 | "collapsed": true 1073 | }, 1074 | ======= 1075 | "metadata": {}, 1076 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1077 | "outputs": [], 1078 | "source": [ 1079 | "np.isnan(v)" 1080 | ] 1081 | }, 1082 | { 1083 | "cell_type": "markdown", 1084 | "metadata": {}, 1085 | "source": [ 1086 | "Calling `isnan()` on an array applies the function to each value of the array, returning `True` if the element is `nan` and `False` if the element is valid. Knowing whether your array contains `nan` values is all well and good, but how do we remove the `nan`s? Handily enough, NumPy arrays can be indexed by boolean values (`True` or `False`). If we use a boolean array to index an array, we keep only the values at positions marked `True` and drop those marked `False`. We therefore invert the output of `isnan()` to create a boolean array that is `True` for everything that is *not* `nan` and `False` for the `nan`s, and we use that to index the same array." 
1087 | ] 1088 | }, 1089 | { 1090 | "cell_type": "markdown", 1091 | "metadata": {}, 1092 | "source": [ 1093 | "np.isnan()吃掉数组,吐出的是`True`和`False`,如果元素是`nan`,吐出`True`,反之就是`False`\n", 1094 | ",然后利用索引就可以把不是`nan`的值给筛选出来了。" 1095 | ] 1096 | }, 1097 | { 1098 | "cell_type": "code", 1099 | "execution_count": null, 1100 | <<<<<<< HEAD 1101 | "metadata": { 1102 | "collapsed": true 1103 | }, 1104 | ======= 1105 | "metadata": {}, 1106 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1107 | "outputs": [], 1108 | "source": [ 1109 | "ix = ~np.isnan(v) # the ~ indicates a logical not, inverting the bools\n", 1110 | "print(v[ix]) # We can also just write v = v[~np.isnan(v)]" 1111 | ] 1112 | }, 1113 | { 1114 | "cell_type": "code", 1115 | "execution_count": null, 1116 | <<<<<<< HEAD 1117 | "metadata": { 1118 | "collapsed": true 1119 | }, 1120 | ======= 1121 | "metadata": {}, 1122 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1123 | "outputs": [], 1124 | "source": [ 1125 | "print(np.mean(v[ix]))" 1126 | ] 1127 | }, 1128 | { 1129 | "cell_type": "markdown", 1130 | "metadata": {}, 1131 | "source": [ 1132 | "There are a few shortcuts to this process in the form of NumPy functions specifically built to handle them, such as `nanmean()`." 1133 | ] 1134 | }, 1135 | { 1136 | "cell_type": "markdown", 1137 | "metadata": {}, 1138 | "source": [ 1139 | "也可以直接使用内置函数`nanmean()`实现上述的功能。" 1140 | ] 1141 | }, 1142 | { 1143 | "cell_type": "code", 1144 | "execution_count": null, 1145 | <<<<<<< HEAD 1146 | "metadata": { 1147 | "collapsed": true 1148 | }, 1149 | ======= 1150 | "metadata": {}, 1151 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1152 | "outputs": [], 1153 | "source": [ 1154 | "print(np.nanmean(v))" 1155 | ] 1156 | }, 1157 | { 1158 | "cell_type": "markdown", 1159 | "metadata": {}, 1160 | "source": [ 1161 | "The `nanmean()` function simply calculates the mean of the array as if there were no `nan` values at all! 
There are a few more of these functions, so feel free to read more about them in the [documentation](https://docs.scipy.org/doc/numpy/user/index.html). These indeterminate values are more an issue with data than linear algebra itself so it is helpful that there are ways to handle them." 1162 | ] 1163 | }, 1164 | { 1165 | "cell_type": "markdown", 1166 | "metadata": {}, 1167 | "source": [ 1168 | <<<<<<< HEAD 1169 | "`nanmean()` 轻松实现了不含`nan`的均值的计算, [点我查看Numpy的使用指南](https://docs.scipy.org/doc/numpy/user/index.html). 我们在进行线性代数计算之前,首先要处理好诸如缺省值这些数据问题。" 1170 | ======= 1171 | "`nanmean()` 轻松实现了不含`nan`的均值的计算, [点我查看Numpy的使用指南](https://docs.scipy.org/doc/numpy/user/index.html). 我们在进行线性代数计算之前,首先处理好诸如缺省值这些数据问题。" 1172 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1173 | ] 1174 | }, 1175 | { 1176 | "cell_type": "markdown", 1177 | "metadata": {}, 1178 | "source": [ 1179 | "### Conclusion\n", 1180 | "\n", 1181 | "Linear algebra is pervasive in finance and in general. For example, the calculation of *optimal* weights according to modern portfolio theory is done using linear algebra techniques. The arrays and functions in NumPy allow us to handle these calculations in an intuitive way. For a quick intro to linear algebra and how to use NumPy to do more significant matrix calculations, proceed to the next section." 1182 | ] 1183 | }, 1184 | { 1185 | "cell_type": "markdown", 1186 | "metadata": {}, 1187 | "source": [ 1188 | "### 小结\n", 1189 | "\n", 1190 | <<<<<<< HEAD 1191 | "线代在金融里的应用是无处不在的。`modern portfolio`理论的权重值的*最优化(optimal)*问题就需要用到线性代数。\n", 1192 | ======= 1193 | "线代在金融里是无处不在的。线代资产组合理论的权重值的*optimal*问题就需要用到线性代数。\n", 1194 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1195 | "下面就让我们开启线代之旅吧。" 1196 | ] 1197 | }, 1198 | { 1199 | "cell_type": "markdown", 1200 | "metadata": {}, 1201 | "source": [ 1202 | "## A brief foray into linear algebra\n", 1203 | "\n", 1204 | "Let's start with a basic overview of some linear algebra. 
Linear algebra comes down to the multiplication and composition of scalar and matrix values. A scalar value is just a real number that we multiply against an array. When we scale a matrix or array using a scalar, we multiply each individual element of that matrix or array by the scalar.\n", 1205 | "\n", 1206 | "A matrix is a collection of values, typically represented by an $m \\times n$ grid, where $m$ is the number of rows and $n$ is the number of columns. The edge lengths $m$ and $n$ do not necessarily have to be different. If we have $m = n$, we call this a square matrix. A particularly interesting case of a matrix is when $m = 1$ or $n = 1$. In this case we have a special case of a matrix that we call a vector. While there is a matrix object in NumPy, we will be doing everything using NumPy arrays because they can have more than $2$ dimensions. For the purpose of this section, we will be using matrix and array interchangeably.\n", 1207 | "\n", 1208 | "We can express the matrix equation as:\n", 1209 | "\n", 1210 | "$$ y = A\\cdot x $$\n", 1211 | "\n", 1212 | "Where $A$ is an $m \\times n$ matrix, $y$ is an $m \\times 1$ vector, and $x$ is an $n \\times 1$ vector. On the right-hand side of the equation we are multiplying a matrix by a vector. This requires a little bit more clarification, lest we think that we can go about multiplying any matrices by any other matrices.\n", 1213 | "\n", 1214 | "#### Matrix multiplication\n", 1215 | "\n", 1216 | "With matrix multiplication, the order in which the matrices are multiplied matters. Multiplying a matrix on the left side by another matrix may be just fine, but multiplying on the right may be undefined." 
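A small sketch of why the order matters, using two hypothetical arrays `P` and `Q` (these are not the notebook's `A` and `B`). `np.dot` succeeds when the number of columns on the left equals the number of rows on the right, and raises a `ValueError` otherwise.

```python
import numpy as np

P = np.ones((4, 3))  # 4 x 3
Q = np.ones((3, 5))  # 3 x 5

# Inner dimensions match: (4 x 3) . (3 x 5) gives a 4 x 5 result
product = np.dot(P, Q)
print(product.shape)

# Reversed order: (3 x 5) . (4 x 3) is undefined because 5 != 4
try:
    np.dot(Q, P)
    reversed_is_defined = True
except ValueError as err:
    reversed_is_defined = False
    print("undefined product:", err)
```

Checking `.shape` on both operands before multiplying is usually the quickest way to diagnose this kind of error.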
1217 | ] 1218 | }, 1219 | { 1220 | "cell_type": "markdown", 1221 | "metadata": {}, 1222 | "source": [ 1223 | "## 线性代数简介\n", 1224 | "\n", 1225 | "让我们来回顾下线性代数的一些基本概念。我们先来看看线性代数中标量和矩阵的乘法。一个标量就是一个实数,当我们用一个标量乘以一个矩阵的时候,就是将矩阵中每个元素都乘以这个标量,从而实现了矩阵的缩放。\n", 1226 | "\n", 1227 | <<<<<<< HEAD 1228 | "矩阵就是数值的集合,通常我们会将其展示成$m \\times n$表格的形式,$m$代表矩阵的行数,$n$代表矩阵的列数。$m$和$n$并不要求相等,当他们相等时,我们称这个矩阵为`方阵`。对于$m = 1$ 或者 $n = 1$的矩阵,这种特殊的矩阵,有时也称作向量。在`Numpy`中我们用多维(维度大于等于2)数组表示一个矩阵对象,在本节课中,我们视情况使用矩阵或者数组这两种称谓。\n", 1229 | ======= 1230 | "矩阵就是数值的集合,通常我们会将其展示成$m \\times n$表格的形式,$m$代表矩阵的行数,$n$代表矩阵的列数。$m$和$n$并不要求相等,当他们相等时,我们称这个矩阵为`方阵`。对于$m = 1$ 或者 $n = 1$的矩阵,这种特殊的矩阵,有时也称作向量。在`Numpy`中我们用多维(维度大于等于2)数组表示一个矩阵对象,在本节课中,我们视情况使用矩阵或者数组。\n", 1231 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1232 | "\n", 1233 | "一个矩阵等式如下:\n", 1234 | "\n", 1235 | "$$ y = A\\cdot x $$\n", 1236 | "\n", 1237 | "A是一个$m \\times n$ 的矩阵, $y$ 是一个 $m \\times 1$ 向量, and $x$ 是一个 $n \\times 1$ 向量.所以等式的右边是一个矩阵乘以一个向量,需要注意的是,矩阵乘法,对于矩阵的形状有严格要求。\n", 1238 | "\n", 1239 | "#### 矩阵乘法\n", 1240 | "\n", 1241 | "故在矩阵乘法中,两个矩阵的顺序很重要,$$ A\\cdot x $$可以成立,但是$$ x\\cdot A $$就不一定。(矩阵乘法不满足交换律)" 1242 | ] 1243 | }, 1244 | { 1245 | "cell_type": "code", 1246 | "execution_count": null, 1247 | "metadata": { 1248 | "collapsed": true 1249 | }, 1250 | "outputs": [], 1251 | "source": [ 1252 | "A = np.array([\n", 1253 | " [1, 2, 3, 12, 6],\n", 1254 | " [4, 5, 6, 15, 20],\n", 1255 | " [7, 8, 9, 10, 10] \n", 1256 | " ])\n", 1257 | "B = np.array([\n", 1258 | " [4, 4, 2],\n", 1259 | " [2, 3, 1],\n", 1260 | " [6, 5, 8],\n", 1261 | " [9, 9, 9]\n", 1262 | " ])" 1263 | ] 1264 | }, 1265 | { 1266 | "cell_type": "code", 1267 | "execution_count": null, 1268 | <<<<<<< HEAD 1269 | "metadata": { 1270 | "collapsed": true 1271 | }, 1272 | ======= 1273 | "metadata": {}, 1274 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1275 | "outputs": [], 1276 | "source": [ 1277 | "print(A.shape)\n", 1278 | "print(B.shape)" 1279 | ] 1280 | }, 1281 | { 1282 | "cell_type": "markdown", 1283 | 
"metadata": {}, 1284 | "source": [ 1285 | "Notice that the above-defined matrices, $A$ and $B$, have different dimensions. $A$ is $3 \\times 5$ and $B$ is $4 \\times 3$. The general rule of what can and cannot be multiplied in which order is based on the dimensions of the matrices. Specifically, the number of columns in the matrix on the left must be equal to the number of rows in the matrix on the right. In super informal terms, let's say that we have an $m \\times n$ matrix and a $p \\times q$ matrix. If we multiply the first by the second on the right, we get the following:\n", 1286 | "\n", 1287 | "$$ (m \\times n) \\cdot (p \\times q) = (m \\times q) $$\n", 1288 | "\n", 1289 | "So the resultant product has the same number of rows as the left matrix and the same number of columns as the right matrix. This limitation of matrix multiplication with regards to dimensions is important to keep track of when writing code. To demonstrate this, we use the `dot()` function to multiply our matrices below:" 1290 | ] 1291 | }, 1292 | { 1293 | "cell_type": "markdown", 1294 | "metadata": {}, 1295 | "source": [ 1296 | <<<<<<< HEAD 1297 | "注意矩阵 $A$ 和 $B$的维度。 $A$ 是 $3 \\times 5$, 而 $B$ 是 $4 \\times 3$,这种情况是无法进行矩阵相乘的,矩阵乘法规则是乘号左边的矩阵列数要和乘号右边的矩阵的行数相等才可以进行相乘。用数学的语言可以表示为:形如$m \\times n$ 和 $p \\times q$ 两个矩阵:\n", 1298 | ======= 1299 | "注意矩阵 $A$ 和 $B$的维度。 $A$ 是 $3 \\times 5$ 而 $B$ 是 $4 \\times 3$,这种情况是无法进行矩阵相乘的,矩阵乘法规则是乘号左边的矩阵列数要和乘号右边的矩阵的行数相等才可以进行相乘。用数学的语言可以表示为:形如$m \\times n$ 和 $p \\times q$ 两个矩阵:\n", 1300 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1301 | "\n", 1302 | "$$ (m \\times n) \\cdot (p \\times q) = (m \\times q) $$\n", 1303 | "\n", 1304 | "可以看出两个矩阵相乘后的新矩阵,他的行数等于乘号左边矩阵行数,列数等于乘号右边的矩阵列数。在我们进行编程时,要特别留意矩阵的维度。我们通常使用`np.dot()`来进行矩阵乘法的计算。" 1305 | ] 1306 | }, 1307 | { 1308 | "cell_type": "code", 1309 | "execution_count": null, 1310 | <<<<<<< HEAD 1311 | "metadata": { 1312 | "collapsed": true 1313 | }, 1314 | ======= 1315 | "metadata": {}, 1316 | >>>>>>> 
27de8dec4bde8c8ba17d90df1376c379bee91aa9 1317 | "outputs": [], 1318 | "source": [ 1319 | "print(np.dot(B, A))\n", 1320 | "print(np.dot(B, A).shape)" 1321 | ] 1322 | }, 1323 | { 1324 | "cell_type": "markdown", 1325 | "metadata": {}, 1326 | "source": [ 1327 | "These results make sense in accordance with our rule. Multiplying a $3 \\times 5$ matrix on the right by a $4 \\times 3$ matrix would result in an error (the inner dimensions $5$ and $4$ differ), while multiplying a $4 \\times 3$ matrix on the right by a $3 \\times 5$ matrix results in a $4 \\times 5$ matrix." 1328 | ] 1329 | }, 1330 | { 1331 | "cell_type": "markdown", 1332 | "metadata": {}, 1333 | "source": [ 1334 | "计算的结果验证了我们的法则:\n", 1335 | " \n", 1336 | "$$ (4 \\times 3) \\cdot (3 \\times 5) = (4 \\times 5) $$" 1337 | ] 1338 | }, 1339 | { 1340 | "cell_type": "code", 1341 | "execution_count": null, 1342 | <<<<<<< HEAD 1343 | "metadata": { 1344 | "collapsed": true 1345 | }, 1346 | ======= 1347 | "metadata": {}, 1348 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1349 | "outputs": [], 1350 | "source": [ 1351 | "print(np.dot(A.T, B.T))\n", 1352 | "print(np.dot(A.T, B.T).shape)" 1353 | ] 1354 | }, 1355 | { 1356 | "cell_type": "markdown", 1357 | "metadata": {}, 1358 | "source": [ 1359 | "### Portfolio Variance\n", 1360 | "\n", 1361 | "Let's return to our portfolio example from before. We calculated the expected return of the portfolio, but how do we calculate the variance? We start by trying to evaluate the portfolio as a sum of each individual asset, scaled by its weight.\n", 1362 | "\n", 1363 | "$$ VAR[P] = VAR[\\omega_1 S_1 + \\omega_2 S_2 + \\cdots + \\omega_N S_N] $$\n", 1364 | "\n", 1365 | "Where $S_1, \\cdots, S_N$ are the assets contained within our universe. 
If all of our assets were independent of each other, we could simply evaluate this as\n", 1366 | "\n", 1367 | "$$ VAR[P] = VAR[\\omega_1 S_1] + VAR[\\omega_2 S_2] + \\cdots + VAR[\\omega_N S_N] = \\omega_1^2\\sigma_1^2 + \\omega_2^2\\sigma_2^2 + \\cdots + \\omega_N^2\\sigma_N^2 $$\n", 1368 | "\n", 1369 | "However, all of our assets depend on each other by construction. They are all related to our base asset and therefore to each other. We thus have to calculate the variance of the portfolio by including the pairwise covariances between assets. Our formula for the variance of the portfolio is:\n", 1370 | "\n", 1371 | "$$ VAR[P] = \\sigma_P^2 = \\sum_i \\omega_i^2\\sigma_i^2 + \\sum_i\\sum_{j\\neq i} \\omega_i\\omega_j\\sigma_i\\sigma_j\\rho_{i, j}, \\ i, j \\in \\lbrace 1, 2, \\cdots, N \\rbrace $$\n", 1372 | "\n", 1373 | "Where $\\rho_{i,j}$ is the correlation between $S_i$ and $S_j$, $\\rho_{i, j} = \\frac{COV[S_i, S_j]}{\\sigma_i\\sigma_j}$. This seems exceedingly complicated, but we can easily handle all of this using NumPy arrays. First, we calculate the covariance matrix that relates all the individual stocks in our universe." 1374 | ] 1375 | }, 1376 | { 1377 | "cell_type": "markdown", 1378 | "metadata": {}, 1379 | "source": [ 1380 | <<<<<<< HEAD 1381 | "### 组合回报的方差\n", 1382 | "\n", 1383 | "Let's return to our portfolio example from before. We calculated the expected return of the portfolio, but how do we calculate the variance? We start by trying to evaluate the portfolio as a sum of each individual asset, scaled by its weight.\n", 1384 | "\n", 1385 | "让我们回到刚才那个投资组合的例子。我们计算了组合投资回报的期望值(均值),我们应该如何计算这个组合的回报的方差呢?\n", 1386 | ======= 1387 | "### 组合收益的方差\n", 1388 | "\n", 1389 | "Let's return to our portfolio example from before. 
We calculated the expected return of the portfolio, but how do we calculate the variance? We start by trying to evaluate the portfolio as a sum of each individual asset, scaled by its weight.\n", 1390 | "\n", 1391 | "让我们回到刚才那个投资组合的例子。我们计算了组合投资收益的期望值(均值),我们应该如何计算这个组合的收益的方差呢?\n", 1392 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1393 | "\n", 1394 | "$$ VAR[P] = VAR[\\omega_1 S_1 + \\omega_2 S_2 + \\cdots + \\omega_N S_N] $$\n", 1395 | "\n", 1396 | "$S_1, \\cdots, S_N$ 代表了我们组合中每个投资标的物的资产. 如果所有的投资标的都是相互独立的,那么我们可以使用如下的公式进行计算:\n", 1397 | "\n", 1398 | "$$ VAR[P] = VAR[\\omega_1 S_1] + VAR[\\omega_2 S_2] + \\cdots + VAR[\\omega_N S_N] = \\omega_1^2\\sigma_1^2 + \\omega_2^2\\sigma_2^2 + \\cdots + \\omega_N^2\\sigma_N^2 $$\n", 1399 | "\n", 1400 | "然而,我们的标的物都是彼此相关的,因为他们由$R_1$加上一个随机值生成(这些随机值都服从同一个正态分布)。因此,我们就要计算各个标的物之间的协方差,我们的计算公式就变成了:\n", 1401 | "\n", 1402 | "$$ VAR[P] = \\sigma_P^2 = \\sum_i \\omega_i^2\\sigma_i^2 + \\sum_i\\sum_{j\\neq i} \\omega_i\\omega_j\\sigma_i\\sigma_j\\rho_{i, j}, \\ i, j \\in \\lbrace 1, 2, \\cdots, N \\rbrace $$\n", 1403 | "\n", 1404 | "$\\rho_{i,j}$ 为$S_i$ 和 $S_j$的相关系数: $\\rho_{i, j} = \\frac{COV[S_i, S_j]}{\\sigma_i\\sigma_j}$.\n", 1405 | <<<<<<< HEAD 1406 | "这个公式看起来极其复杂,但是我们通过使用NumPy数组可以轻松实现。首先,我们计算各个标的物的之间的协方差矩阵。" 1407 | ======= 1408 | "这个公式看起来极其复杂,但是我们通过使用Numpy数组可以轻松实现。首先,我们计算各个标的物的之间的协方差矩阵。" 1409 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1410 | ] 1411 | }, 1412 | { 1413 | "cell_type": "code", 1414 | "execution_count": null, 1415 | <<<<<<< HEAD 1416 | "metadata": { 1417 | "collapsed": true 1418 | }, 1419 | ======= 1420 | "metadata": {}, 1421 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1422 | "outputs": [], 1423 | "source": [ 1424 | "cov_mat = np.cov(returns)\n", 1425 | "print(cov_mat.shape)\n", 1426 | "print(cov_mat)" 1427 | ] 1428 | }, 1429 | { 1430 | "cell_type": "code", 1431 | "execution_count": null, 1432 | <<<<<<< HEAD 1433 | "metadata": { 1434 | "collapsed": true 1435 | }, 1436 | ======= 1437 | "metadata": {}, 1438 
| >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1439 | "outputs": [], 1440 | "source": [ 1441 | "np.sum(cov_mat.diagonal())" 1442 | ] 1443 | }, 1444 | { 1445 | "cell_type": "markdown", 1446 | "metadata": {}, 1447 | "source": [ 1448 | "This array is not formatted particularly nicely, but a covariance matrix is a very important concept. The covariance matrix is of the form:\n", 1449 | "\n", 1450 | "$$ \\left[\\begin{matrix}\n", 1451 | "VAR[S_1] & COV[S_1, S_2] & \\cdots & COV[S_1, S_N] \\\\\n", 1452 | "COV[S_2, S_1] & VAR[S_2] & \\cdots & COV[S_2, S_N] \\\\\n", 1453 | "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n", 1454 | "COV[S_N, S_1] & COV[S_N, S_2] & \\cdots & VAR[S_N]\n", 1455 | "\\end{matrix}\\right] $$\n", 1456 | "\n", 1457 | "So each diagonal entry is the variance of that asset at that index and each off-diagonal holds the covariance of two assets indexed by the column and row number. What is important is that once we have the covariance matrix we are able to do some very quick linear algebra to calculate the variance of the overall portfolio. We can represent the variance of the portfolio in array form as:\n", 1458 | "\n", 1459 | "$$ \\sigma_p^2 = \\omega \\ C \\ \\omega^\\intercal$$\n", 1460 | "\n", 1461 | "Where $C$ is the covariance matrix of all the assets and $\\omega$ is the array containing the weights of each individual asset. The superscript $\\intercal$ on the second $\\omega$ listed above denotes the **transpose** of $\\omega$. For a reference on the evaluation of the variance of a portfolio as a matrix equation, please see the Wikipedia article on [modern portfolio theory](https://en.wikipedia.org/wiki/Modern_portfolio_theory).\n", 1462 | "\n", 1463 | "The transpose of an array is what you get when you switch the rows and columns of an array. This has the effect of reflecting an array across what you might imagine as a diagonal. 
For example, take our array $A$ from before:" 1464 | ] 1465 | }, 1466 | { 1467 | "cell_type": "markdown", 1468 | "metadata": { 1469 | "collapsed": true 1470 | }, 1471 | "source": [ 1472 | "协方差矩阵是一个非常重要的概念. 它的形式如下:\n", 1473 | "\n", 1474 | "$$ \\left[\\begin{matrix}\n", 1475 | "VAR[S_1] & COV[S_1, S_2] & \\cdots & COV[S_1, S_N] \\\\\n", 1476 | "COV[S_2, S_1] & VAR[S_2] & \\cdots & COV[S_2, S_N] \\\\\n", 1477 | "\\vdots & \\vdots & \\ddots & \\vdots \\\\\n", 1478 | "COV[S_N, S_1] & COV[S_N, S_2] & \\cdots & VAR[S_N]\n", 1479 | "\\end{matrix}\\right] $$\n", 1480 | "\n", 1481 | <<<<<<< HEAD 1482 | "其主对角线的元素对应是每个标的物(股票)回报的方差,非主对角线的元素用行和列的序号来表示对应的两个标的物回报之间的协方差。值得注意的是:当我们拿到协方差矩阵后,可以通过线性代数的方法快速计算出投资组合回报的方差,计算公式为:\n", 1483 | "\n", 1484 | "$$ \\sigma_p^2 = \\omega \\ C \\ \\omega^\\intercal$$\n", 1485 | "\n", 1486 | "其中$C$是协方差矩阵,$\\omega$是股票权重值的矩阵,$\\omega^\\intercal$是$\\omega$的转置矩阵,关于投资组合回报的方差计算公式,可以参考wiki百科[modern portfolio theory](https://en.wikipedia.org/wiki/Modern_portfolio_theory).\n", 1487 | ======= 1488 | "其主对角线的元素对应是每个标的物(证券)收益的方差,非主对角线的元素用行和列的序号来表示对应的两个标的物收益之间的协方差。至关重要的一点就是:当我们拿到协方差矩阵后,可以通过线性代数的方法快速计算出投资组合收益的方差,计算公式为:\n", 1489 | "\n", 1490 | "$$ \\sigma_p^2 = \\omega \\ C \\ \\omega^\\intercal$$\n", 1491 | "\n", 1492 | "其中$C$是协方差矩阵,$\\omega$是证券的权重值,$\\omega^\\intercal$是$\\omega$的转置矩阵,关于投资组合收益的方差计算所用的矩阵表达式,可以参考wiki百科[modern portfolio theory](https://en.wikipedia.org/wiki/Modern_portfolio_theory).\n", 1493 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1494 | "\n", 1495 | "转置矩阵就是将行列互换得到的新矩阵,好比是沿着对角线做了镜像反转。我们来举个栗子好了:\n", 1496 | "对于矩阵$A$:" 1497 | ] 1498 | }, 1499 | { 1500 | "cell_type": "code", 1501 | "execution_count": null, 1502 | <<<<<<< HEAD 1503 | "metadata": { 1504 | "collapsed": true 1505 | }, 1506 | ======= 1507 | "metadata": {}, 1508 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1509 | "outputs": [], 1510 | "source": [ 1511 | "print(A)" 1512 | ] 1513 | }, 1514 | { 1515 | "cell_type": "markdown", 1516 | "metadata": {}, 1517 | "source": [ 1518 | "The 
transpose looks like a mirror image of the same array." 1519 | ] 1520 | }, 1521 | { 1522 | "cell_type": "markdown", 1523 | "metadata": {}, 1524 | "source": [ 1525 | "转置后矩阵如同镜像翻转了一般。" 1526 | ] 1527 | }, 1528 | { 1529 | "cell_type": "code", 1530 | "execution_count": null, 1531 | <<<<<<< HEAD 1532 | "metadata": { 1533 | "collapsed": true 1534 | }, 1535 | ======= 1536 | "metadata": {}, 1537 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1538 | "outputs": [], 1539 | "source": [ 1540 | "print(np.transpose(A))" 1541 | ] 1542 | }, 1543 | { 1544 | "cell_type": "markdown", 1545 | "metadata": {}, 1546 | "source": [ 1547 | "But $\\omega$ here is a 1-dimensional array, a vector! It makes perfect sense to take the transpose of $A$, a $3 \\times 5$ array, as the output will be a $5 \\times 3$ array, but a 1-dimensional array is not quite as intuitive. A typical 1-dimensional array can be thought of as a $1 \\times n$ horizontal vector. Thus, taking the transpose of this array essentially means changing it into an $n \\times 1$ vertical vector. (In NumPy itself, `.T` leaves a 1-dimensional array unchanged, and `dot()` treats it as a row or column vector as the multiplication requires.) This makes sense because 1-dimensional arrays are still arrays and any multiplication done between 1-dimensional and higher dimensional arrays must obey the dimension rule for matrix multiplication.\n", 1548 | "\n", 1549 | "To make a long story short, we think of $\\omega$ as $1 \\times N$ since we have $N$ securities. This makes it so that $\\omega^\\intercal$ is $N \\times 1$. Again, our covariance matrix is $N \\times N$. 
So the overall multiplication works out like so, in informal terms:\n", 1550 | "\n", 1551 | "$$ \\text{Dimensions}(\\sigma_p^2) = \\text{Dimensions}(\\omega C \\omega^\\intercal) = (1 \\times N)\\cdot (N \\times N)\\cdot (N \\times 1) = (1 \\times 1)$$\n", 1552 | "\n", 1553 | "Multiplying the covariance matrix on the left by the plain horizontal vector and on the right by that vector's transpose results in the calculation of a single scalar ($1 \\times 1$) value, our portfolio's variance.\n", 1554 | "\n", 1555 | "So knowing this, let's proceed and calculate the portfolio variance! We can easily calculate the product of these arrays by using `dot()` for matrix multiplication, though this time we have to do it twice." 1556 | ] 1557 | }, 1558 | { 1559 | "cell_type": "markdown", 1560 | "metadata": {}, 1561 | "source": [ 1562 | "But $\\omega$ is a 1-dimensional array, a vector! Transposing $A$ turns it from a $3 \\times 5$ array into a $5 \\times 3$ array, but the 1-dimensional case is less intuitive. We usually picture a 1-dimensional array as a $1 \\times n$ horizontal vector, so transposing it yields an $n \\times 1$ vertical vector. This ensures that multiplication between 1-dimensional and higher-dimensional arrays still obeys the rules of matrix multiplication.\n", 1563 | "\n", 1564 | <<<<<<< HEAD 1565 | "Put simply, when we have $N$ stocks we treat $\\omega$ as a $1 \\times N$ matrix, so the `shape` of $\\omega^\\intercal$ is $N \\times 1$. The `shape` of our covariance matrix is $N \\times N$. So once all the matrix multiplications are done:\n", 1566 | "\n", 1567 | "$$ \\text{Dimensions}(\\sigma_p^2) = \\text{Dimensions}(\\omega C \\omega^\\intercal) = (1 \\times N)\\cdot (N \\times N)\\cdot (N \\times 1) = (1 \\times 1)$$\n", 1568 | "\n", 1569 | "Multiplying the horizontal vector by the covariance matrix and then by the vertical vector gives a scalar ($1 \\times 1$): the variance of the portfolio's return.\n", 1570 | "\n", 1571 | "That is the whole procedure for computing the variance of the portfolio's return!\n", 1572 | "\n", 1573 | "Next, we only need to call `np.dot()` twice to carry out the matrix multiplications, and we are done." 1574 | ======= 1575 | "To make a long story short, when we have $N$ securities we treat $\\omega$ as a $1 \\times N$ matrix, so the `shape` of $\\omega^\\intercal$ is $N \\times 1$. The `shape` of our covariance matrix is $N \\times N$. So once all the matrix multiplications are done:\n", 1576 | "\n", 1577 | "$$ \\text{Dimensions}(\\sigma_p^2) = \\text{Dimensions}(\\omega C \\omega^\\intercal) = (1 \\times N)\\cdot (N \\times N)\\cdot (N \\times 1) = (1 \\times 1)$$\n", 1578 | "\n", 1579 | 
"Multiplying the horizontal vector by the covariance matrix and then by the vertical vector gives a scalar ($1 \\times 1$): the variance of the portfolio's return.\n", 1580 | "\n", 1581 | "That is the whole procedure for computing the variance of the portfolio's return! Next, we only need to call `np.dot()` twice to carry out the matrix multiplications, and we are done." 1582 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1583 | ] 1584 | }, 1585 | { 1586 | "cell_type": "code", 1587 | "execution_count": null, 1588 | <<<<<<< HEAD 1589 | "metadata": { 1590 | "collapsed": true 1591 | }, 1592 | ======= 1593 | "metadata": {}, 1594 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1595 | "outputs": [], 1596 | "source": [ 1597 | "print(\"shape of weights:\", weights.shape)\n", 1598 | "print(\"shape of covariance:\", cov_mat.shape)" 1599 | ] 1600 | }, 1601 | { 1602 | "cell_type": "code", 1603 | "execution_count": null, 1604 | <<<<<<< HEAD 1605 | "metadata": { 1606 | "collapsed": true 1607 | }, 1608 | ======= 1609 | "metadata": {}, 1610 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1611 | "outputs": [], 1612 | "source": [ 1613 | "p_var = np.dot(np.dot(weights.T, cov_mat), weights)\n", 1614 | "p_std_cov = np.sqrt(p_var)\n", 1615 | "print(\"the portfolio std is:\", p_std_cov)" 1616 | ] 1617 | }, 1618 | { 1619 | "cell_type": "code", 1620 | "execution_count": null, 1621 | "metadata": { 1622 | <<<<<<< HEAD 1623 | "collapsed": true, 1624 | ======= 1625 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1626 | "scrolled": false 1627 | }, 1628 | "outputs": [], 1629 | "source": [ 1630 | "# Calculating the portfolio volatility\n", 1631 | "var_p = np.dot(np.dot(weights, cov_mat), weights.T)\n", 1632 | "vol_p = np.sqrt(var_p)\n", 1633 | "print(\"Portfolio volatility: \", vol_p)" 1634 | ] 1635 | }, 1636 | { 1637 | "cell_type": "markdown", 1638 | "metadata": {}, 1639 | "source": [ 1640 | "To confirm this calculation, let's simply evaluate the volatility of the portfolio using only NumPy functions." 
1641 | ] 1642 | }, 1643 | { 1644 | "cell_type": "markdown", 1645 | "metadata": {}, 1646 | "source": [ 1647 | "To verify the result, we compute the portfolio volatility using NumPy's built-in functions." 1648 | ] 1649 | }, 1650 | { 1651 | "cell_type": "code", 1652 | "execution_count": null, 1653 | <<<<<<< HEAD 1654 | "metadata": { 1655 | "collapsed": true 1656 | }, 1657 | ======= 1658 | "metadata": {}, 1659 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1660 | "outputs": [], 1661 | "source": [ 1662 | "# Confirming calculation\n", 1663 | "vol_p_alt = np.sqrt(np.var(np.dot(weights, returns), ddof=1))\n", 1664 | "print(\"Portfolio volatility: \", vol_p_alt)" 1665 | ] 1666 | }, 1667 | { 1668 | "cell_type": "markdown", 1669 | "metadata": {}, 1670 | "source": [ 1671 | "The `ddof` parameter is an integer that sets the delta degrees of freedom: the variance is computed with divisor `N - ddof`, so `ddof=1` gives the sample variance. This is more of a statistical detail, but what the result tells us is that our matrix calculation is correct!\n", 1672 | "\n", 1673 | "A lot of this might not make sense at first glance. It helps to go back and forth between the theory and the code representations until you have a better grasp of the mathematics involved. It is definitely not necessary to be an expert on linear algebra and on matrix operations, but linear algebra can help to streamline the process of working with large amounts of data. For further reading on NumPy, check out the [documentation](https://docs.scipy.org/doc/numpy/user/index.html)." 1674 | ] 1675 | }, 1676 | { 1677 | "cell_type": "markdown", 1678 | "metadata": {}, 1679 | "source": [ 1680 | <<<<<<< HEAD 1681 | "The `ddof` parameter specifies the degrees of freedom. `np.var` defaults to `ddof=0`; here I set it to 1, telling the function to compute the variance with one degree of freedom taken into account. This is a statistical concept, and the result of this function confirms that the variance we computed with the matrix method is correct.\n", 1682 | "\n", 1683 | "Much of what we covered today may seem of little use at first glance, but once you have a better grasp of these mathematical concepts you will be able to express the underlying theory well in code. We do not require you to master linear algebra or matrix operations, but linear algebra can help streamline how you process data. For more, see the NumPy user guide: [click here](https://docs.scipy.org/doc/numpy/user/index.html)." 
1684 | ======= 1685 | "The `ddof` parameter is an integer that sets the delta degrees of freedom: the variance is computed with divisor `N - ddof`, so `ddof=1` gives the sample variance. This is more of a statistical detail, but what the result tells us is that our matrix calculation is correct!\n", 1686 | "\n", 1687 | "A lot of this might not make sense at first glance. It helps to go back and forth between the theory and the code representations until you have a better grasp of the mathematics involved. It is definitely not necessary to be an expert on linear algebra and on matrix operations, but linear algebra can help to streamline the process of working with large amounts of data. For further reading on NumPy, check out the [documentation](https://docs.scipy.org/doc/numpy/user/index.html).\n", 1688 | "\n", 1689 | "The `ddof` parameter specifies the degrees of freedom. `np.var` defaults to `ddof=0`; here I set it to 1, which gives the sample variance. It is a statistical concept that may be hard to grasp at first, but the result tells us that the variance computed with the matrix method is correct.\n", 1690 | "\n", 1691 | "Much of what we covered today may seem of little use, but once you have a better grasp of these mathematical concepts you will be able to express the underlying theory well in code. We do not require you to master linear algebra or matrix operations, but linear algebra can help streamline how you process data. For more, see the NumPy user guide: [click here](https://docs.scipy.org/doc/numpy/user/index.html)." 
1692 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1693 | ] 1694 | } 1695 | ], 1696 | "metadata": { 1697 | "kernelspec": { 1698 | "display_name": "Python 3", 1699 | "language": "python", 1700 | "name": "python3" 1701 | }, 1702 | "language_info": { 1703 | "codemirror_mode": { 1704 | "name": "ipython", 1705 | "version": 3 1706 | }, 1707 | "file_extension": ".py", 1708 | "mimetype": "text/x-python", 1709 | "name": "python", 1710 | "nbconvert_exporter": "python", 1711 | "pygments_lexer": "ipython3", 1712 | <<<<<<< HEAD 1713 | "version": "3.5.4" 1714 | ======= 1715 | "version": "3.6.2" 1716 | >>>>>>> 27de8dec4bde8c8ba17d90df1376c379bee91aa9 1717 | } 1718 | }, 1719 | "nbformat": 4, 1720 | "nbformat_minor": 1 1721 | } 1722 | -------------------------------------------------------------------------------- /A Professional Quant Equity Workflow.md: -------------------------------------------------------------------------------- 1 | # A Professional Quant Equity Workflow 2 | 3 | 4 | 5 | 6 | --- 7 | 8 | 9 | By Jonathan Larkin, Chief Investment Officer at Quantopian 10 | 11 | 12 | [Original article](https://blog.quantopian.com/a-professional-quant-equity-workflow/) 13 | 14 | 15 | In an earlier [post](https://blog.quantopian.com/the-foundation-of-algo-success/), I laid out the philosophical foundation of a quantitative investment strategy with a high Sharpe ratio. 16 | 17 | 18 | Today we take a closer look at a popular investment process and go deeper into one corner of quantitative investing: cross-sectional equity investing, also known as **equity statistical arbitrage** or **equity market neutral investing**. 19 | This kind of equity investing typically means holding long and short positions in hundreds of stocks within a tightly risk-controlled portfolio, with the goal of capturing transient market anomalies that are uncorrelated with the direction of the market or with major risk factors. 
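20 | 21 | 22 | If you want to know how the elite quants at the world's largest hedge funds spend their days, read on: 23 | 24 | 25 | Every cross-sectional strategy can be distilled into six stages: 26 | - Data 27 | - Universe Definition 28 | - Alpha Discovery 29 | - Alpha Combination 30 | - Portfolio Construction 31 | - Trading 32 | 33 | Success at each stage is a necessary condition for the strategy to succeed, but not a sufficient one. Only careful, well-considered decisions at every stage make the strategy work well. The flow chart is shown below: 34 | ![image](https://github.com/kite8/Quantopian-lectures-notebook-translation/blob/master/image/workflow.jpg) 35 | 36 | ## Data ## 37 | 38 | 39 | Quants are data-driven, data-informed investors. All quantitative investing starts with data. Fortunately, Quantopian has already done the legwork for you, providing datasets that are cleaned, tagged, symbol mapped, joined across vendors, and constructed, where possible, point-in-time. As a first step, you need to answer one simple question: "Which datasets carry information that helps predict future returns?" If you are going to dig for gold, you need to know where to start digging. 40 | 41 | 42 | ## Universe Definition ## 43 | 44 | Today, Quantopian's historical price data covers roughly 8,000 US exchange-listed securities. Before moving on to the strategy itself (the more appealing part of the work), you must first filter the listed securities down to the set you are comfortable trading, that is, your trading universe. The Q500US and Q1500US were recently released, and you can use either of them, or use their underlying mechanisms to help you build a custom universe of your own. 45 | 46 | You may ask, "Why impose these limits on yourself? Wouldn't it be better to use all the available data? Didn't your previous post argue for exploiting 'breadth' as much as possible?" 47 | 48 | First, there are practical considerations, such as filtering out illiquid stocks. There is also a less obvious but crucial reason for filtering: a successful cross-sectional strategy keeps a balance in its universe, so that the securities' price behavior is neither too similar nor too different. To rank securities sensibly, a cross-sectional strategy needs to measure their relative value, and the data along the ranking dimension needs to be standardized (normalization, feature scaling). 49 | 50 | When designing a universe, you must have an investment thesis of your own. Consider the following two examples. If your strategy trades intraday on the information content of overnight stock moves, securities such as ADRs must be removed. Because information spreads from an ADR's home market with a time difference, including them would be logically inconsistent with a strategy that relies on investor behavior reflected in prices on US exchanges. As a second example, if your strategy is based on financial statement data, such as the accrual anomaly, you must exclude stocks to which that measure does not apply (in this case, bank stocks, for instance). 51 | 52 | ## Alpha Discovery ## 53 | An alpha can be understood as a vector of real numbers built from the relative returns of each stock in the universe when it is traded with a cross-sectional strategy. An alpha may be built on a linear scale or as a dimensionless vector. Alphas are also called factors, and on Quantopian either name is fine. The [Pipeline API](https://www.quantopian.com/tutorials/pipeline) will take you into the world of alpha modeling. At this stage, do not consider real-world concerns such as trading, commissions, or risk. Form a hypothesis about investor behavior, market structure, information asymmetry, or any other potential source of market inefficiency, and see whether that hypothesis has predictive power. Need some ideas? Search Google or [SSRN search](https://papers.ssrn.com/sol3/DisplayAbstractSearch.cfm) for "equity market anomalies". 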
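
To make the idea of an alpha as one real number per stock more concrete, here is a minimal NumPy sketch. It assumes a hypothetical raw signal (the five values are made up for illustration) and standardizes it across the universe so that the result is dimensionless. The function name `zscore_alpha` is illustrative only and is not part of the Pipeline API.

```python
import numpy as np


def zscore_alpha(raw_signal):
    """Standardize a raw per-stock signal across the universe (cross-sectional z-score)."""
    raw = np.asarray(raw_signal, dtype=float)
    spread = raw.std()
    if spread == 0:
        # Every stock has the same signal, so there is no relative view to express.
        return np.zeros_like(raw)
    return (raw - raw.mean()) / spread


# Hypothetical raw signal for a five-stock universe (made-up numbers).
raw = [0.02, -0.01, 0.05, 0.00, -0.03]
alpha = zscore_alpha(raw)
print(alpha)               # one real number per stock
print(np.argsort(-alpha))  # stock indices ordered from highest to lowest alpha
```

Only the relative ordering and spacing of the values matter here; any real hypothesis would replace the made-up signal with something computed from data.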
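54 | 55 | Alpha research is where art meets science, and where the magic tends to happen. It is an iterative process: propose a hypothesis, test it, analyze the results, and revise the hypothesis. We recently released a new open source project, currently being tested in Quantopian Research, called [`alphalens`](https://www.quantopian.com/posts/alphalens-a-new-tool-for-analyzing-alpha-factors). You can express your alpha with the `Pipeline API` and evaluate its efficacy with `alphalens`. 56 | 57 | ## Alpha Combination ## 58 | In today's markets it is hard to build an investment strategy on a single-factor (alpha) model. A successful strategy usually has many independent factors (alphas); if those factors are strong enough, a small number will suffice. The goal of this stage is to weight multiple normalized factors (alphas) into one final factor whose predictive power is stronger than that of the best single factor. The weighting scheme can be simple: sometimes adding ranks or averaging the factors (alphas) is a perfectly good solution. In fact, one very popular model simply combines [two factors (alphas)](https://www.amazon.com/Little-Book-Still-Beats-Market/dp/0470624159/ref=sr_1_1?ie=UTF8&qid=1471483267&sr=8-1&keywords=the+little+book+that+beats+the+market). As complexity grows, classic portfolio theory can help; for example, the final combination weights can be chosen with [Markowitz's mean-variance portfolio model (lowest possible variance)](https://www.quantopian.com/posts/the-efficient-frontier-markowitz-portfolio-optimization-in-python-using-cvxopt). Finally (take note!), modern machine learning techniques (deep learning and reinforcement learning?) can capture complex relationships between factors. Turning your factors into features and feeding them to a machine learning classifier is a popular direction of current research. 59 | 60 | ## Portfolio Construction ## 61 | Up to this stage we have only been doing research, not putting anything into practice. That work is best done in the unstructured Quantopian [Research environment](https://www.quantopian.com/research), because ideas can be validated quickly there. Now the problem changes: we have a final alpha vector, and we need to use it to trade a structured, real-world portfolio for profit. In each iteration we repeat the following steps: compute the alphas, combine them into a final alpha, use the final alpha together with the portfolio from the previous iteration to compute an ideal portfolio, and generate a trade list that gradually moves the previous portfolio toward the ideal one. 62 | 63 | Defining the step that builds the ideal portfolio raises many questions: How do you think about risk (that is, what is your risk model)? What is your objective function in the portfolio construction step? What constraints is the portfolio subject to? 64 | 65 | These three questions always have to be answered. Today we use only the simplest technique: build the portfolio from quantiles of the final alpha vector. For example, your long positions are the top fifth and your short positions the bottom fifth, with weights set so that the long and short sides each reach a target invested amount and have equal value. 66 | 67 | As portfolios grow more sophisticated, the answers to these three questions grow more complex as well. 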
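
The quantile technique described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not Quantopian's implementation: positions are equal-weighted within each side (an assumption the article does not spell out), `gross_per_side` is a hypothetical parameter for the amount invested on each side, and the alpha values are made up.

```python
import numpy as np


def quintile_long_short(alpha, gross_per_side=1.0):
    """Go long the top fifth and short the bottom fifth of `alpha`, equal-weighted per side."""
    alpha = np.asarray(alpha, dtype=float)
    n = alpha.size
    if n < 2:
        raise ValueError("need at least two stocks to form a long and a short side")
    k = max(n // 5, 1)                 # number of names on each side
    order = np.argsort(alpha)          # stock indices from lowest to highest alpha
    weights = np.zeros(n)
    weights[order[-k:]] = gross_per_side / k   # long the highest-alpha names
    weights[order[:k]] = -gross_per_side / k   # short the lowest-alpha names
    return weights


# Hypothetical final alpha vector for a ten-stock universe (made-up numbers).
alpha = [0.9, -1.2, 0.1, 2.3, -0.4, 0.0, 1.1, -2.0, 0.5, -0.7]
w = quintile_long_short(alpha)
print(w)
print(w.sum())  # long and short sides offset, so the net exposure is zero
```

Because each side receives the same total weight, the long and short values are equal by construction; a risk model, an objective function, and constraints would replace this rule in a more sophisticated setup.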
68 | 69 | ## Trading ## 70 | The output of the previous steps is an ideal portfolio and a trade list that moves the current portfolio to it. Trading is the stage where those trades are executed in the market. Given the characteristics of every choice you have made so far, you need to answer some practical questions: How frequently do I need to trade? How quickly does the alpha's predictive power decay? Should I be passive and patient in the market, or aggressive and fast in execution? You can use `alphalens` to examine the persistence and turnover of your final alpha, and [`pyfolio`](https://quantopian.github.io/pyfolio/notebooks/round_trip_example/) to view a round-trip analysis of the complete strategy. If universe selection is a balance between dispersion and self-similarity, and portfolio construction is a balance between risk and reward, then trading is a balance among alpha decay, explicit costs, implicit costs, and information leakage. 71 | 72 | --- 73 | 74 | You might ask, "Must I follow this workflow strictly to succeed, or to earn an allocation for my algorithm on Quantopian?" Not necessarily. We are looking for high Sharpe ratio strategies that remain stable out of sample; no single workflow solves this problem perfectly. The space of possible strategy designs is vast. In this post I have sketched a framework and shown how the world's largest and most successful quant firms approach the problem. 75 | 76 | *"To follow the path, look to the master, follow the master, walk with the master, see through the master, become the master."* 77 | 78 | This structure leaves you free to improvise and create a framework of your own. 79 | 80 | --- 81 | 82 | [1] Sharpe Ratio is a statistical measurement of the risk adjusted performance of a portfolio, and is calculated by dividing a portfolio’s average return by the standard deviation of its returns. It shows a portfolio’s reward per unit of risk and is useful when comparing two similar portfolios. The higher the Sharpe Ratio, the better the portfolio's risk-adjusted performance. 83 | 84 | 85 | 86 | [2] Aboody, David and Even Tov, Omri and Lehavy, Reuven and Trueman, Brett, Overnight Returns and Firm-Specific Investor Sentiment (April 11, 2016). Available at SSRN: https://ssrn.com/abstract=2554010 or https://dx.doi.org/10.2139/ssrn.2554010 87 | 88 | 89 | 90 | [3] Dechow, Patricia M. and Khimich, Natalya V. and Sloan, Richard G., The Accrual Anomaly (March 22, 2011). Available at SSRN: https://ssrn.com/abstract=1793364 or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1793364 91 | 92 | 93 | 94 | [4] Creamer, Germán G. and Freund, Yoav, Automated Trading with Boosting and Expert Weighting (April 1, 2010). Quantitative Finance, Vol. 4, No. 10, pp. 401–420. 
Available at SSRN: https://ssrn.com/abstract=937847 95 | 96 | 97 | 98 | [5] Huerta, Ramon and Elkan, Charles and Corbacho, Fernando, Nonlinear Support Vector Machines Can Systematically Identify Stocks with High and Low Future Returns (September 6, 2012). Algorithmic Finance (2013), 2:1, 45-58. Available at SSRN: https://ssrn.com/abstract=1930709 or https://dx.doi.org/10.2139/ssrn.1930709 99 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Quantopian-lectures-notebook-translation 2 | Quantopian lectures notebook translation 3 | -------------------------------------------------------------------------------- /data/IF888-2011.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kite8/Quantopian-lectures-notebook-translation/8994da1524dd2d3738572ee30c725a81d85c1625/data/IF888-2011.mat -------------------------------------------------------------------------------- /image/cell_mode_change.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kite8/Quantopian-lectures-notebook-translation/8994da1524dd2d3738572ee30c725a81d85c1625/image/cell_mode_change.jpg -------------------------------------------------------------------------------- /image/workflow.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kite8/Quantopian-lectures-notebook-translation/8994da1524dd2d3738572ee30c725a81d85c1625/image/workflow.jpg --------------------------------------------------------------------------------