├── README.md
├── data
    ├── example.csv
    └── gwr.csv
├── jupyter notebook
    └── example.ipynb
└── mgtwr
    ├── __init__.py
    ├── __pycache__
        ├── __init__.cpython-39.pyc
        ├── diagnosis.cpython-39.pyc
        ├── function.cpython-39.pyc
        ├── function_.cpython-39.pyc
        ├── kernel.cpython-39.pyc
        ├── kernelt.cpython-39.pyc
        ├── model.cpython-39.pyc
        ├── modelt.cpython-39.pyc
        ├── obj.cpython-39.pyc
        ├── objt.cpython-39.pyc
        ├── sel.cpython-39.pyc
        ├── selt.cpython-39.pyc
        └── setup.cpython-39.pyc
    ├── diagnosis.py
    ├── function.py
    ├── kernel.py
    ├── model.py
    ├── obj.py
    ├── sel.py
    └── setup.py

/README.md:
--------------------------------------------------------------------------------
1 | # mgtwr
2 |
3 | Fits geographically weighted regression (GWR), geographically and temporally weighted regression (GTWR), and multiscale geographically and temporally weighted regression (MGTWR) models.
4 | See example.ipynb to learn how to use it.
5 |
6 | # model.py improvements
7 | When **parallel processing** is enabled, the code in **model.py** is what runs. The original code used multiprocessing. Profiling with cProfile and pstats showed that waiting on **thread.lock** was the main time-consuming task, which made parallel processing slower than non-parallel processing. I therefore compared ThreadPoolExecutor from concurrent.futures with joblib, and joblib turned out to be the fastest. All results were obtained on the provided **example.csv**.
8 | As the results below show, joblib greatly reduces the parallel processing time and is also a large improvement over the original. Note, however, that **{method 'acquire' of '_thread.lock' objects}** is still the most time-consuming task; how to solve this is beyond my ability.
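The profiling step described above can be reproduced in a few lines. The following is a minimal sketch using only the standard library, not the code from model.py: `fit_one`, `run_serial`, `run_threaded` and `profile_top` are hypothetical names introduced for illustration, and `fit_one` is only a stand-in for one local model fit. It profiles a serial run and a thread-pool run with cProfile and returns the pstats report ordered by internal time.

```python
import cProfile
import io
import pstats
from concurrent.futures import ThreadPoolExecutor


def fit_one(i):
    # Hypothetical stand-in for one local regression fit (not from mgtwr).
    return sum(k * k for k in range(1000 + i))


def run_serial(n_tasks):
    # Run every task in the calling thread.
    return [fit_one(i) for i in range(n_tasks)]


def run_threaded(n_tasks, workers=4):
    # Hand the same tasks to a thread pool; map() keeps the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fit_one, range(n_tasks)))


def profile_top(func, *args, limit=5):
    """Profile func(*args); return (result, report sorted by internal time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(limit)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, serial_report = profile_top(run_serial, 200)
    _, threaded_report = profile_top(run_threaded, 200)
    print(serial_report)
    print(threaded_report)
```

One possible reading of the lock timings, stated here as an assumption rather than a finding: when cProfile only records the calling thread, the time the main thread spends waiting for workers shows up as `acquire` of `_thread.lock`, so a large value there may reflect waiting on workers rather than locking overhead itself.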
9 |
10 | Related computer configuration:
11 | 12th Gen Intel(R) Core(TM) i5-12600KF (6+4 cores, 16 threads)
12 | Crucial 16GB DDR5-4800 UDIMM
13 | (My machine is not low-end, so I'm sad that the run takes 11 minutes here when it takes only 6 minutes in the example.)
14 |
15 | # result comparison:
16 | 1. Original(**thread=1**)
17 | **time cost: 0:11:3.748**
18 | 636805417 function calls (613137518 primitive calls) in 663.749 seconds
19 |
20 | Ordered by: internal time
21 |
22 | ncalls tottime percall cumtime percall filename:lineno(function)
23 | 2151360 169.201 0.000 225.785 0.000 d:\anaconda\Lib\site-packages\scipy\linalg\_basic.py:40(solve)
24 | 19328497/15022891 35.541 0.000 81.668 0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
25 |
26 | 2. Original(**thread=15**)
27 | **time cost: 0:24:29.669**
28 | 6022657 function calls (6019718 primitive calls) in 1469.671 seconds
29 |
30 | Ordered by: internal time
31 |
32 | ncalls tottime percall cumtime percall filename:lineno(function)
33 | 23663 1397.118 0.059 1397.118 0.059 {method 'acquire' of '_thread.lock' objects}
34 |
35 | 3. ThreadPoolExecutor(**thread=15**)
36 | **time cost: 0:07:44.877**
37 | 99635100 function calls (99632161 primitive calls) in 464.878 seconds
38 |
39 | Ordered by: internal time
40 |
41 | ncalls tottime percall cumtime percall filename:lineno(function)
42 | 10310899 409.240 0.000 409.240 0.000 {method 'acquire' of '_thread.lock' objects}
43 |
44 | 4. 
joblib(**thread=15**)
45 | **time cost: 0:03:40.609**
46 | 12395230 function calls (12381086 primitive calls) in 220.610 seconds
47 |
48 | Ordered by: internal time
49 |
50 | ncalls tottime percall cumtime percall filename:lineno(function)
51 | 902982 187.731 0.000 187.731 0.000 {method 'acquire' of '_thread.lock' objects}
52 | 1245 22.471 0.018 215.907 0.173 d:\anaconda\Lib\site-packages\joblib\parallel.py:960(retrieve)
53 | 403539 2.014 0.000 192.678 0.000 d:\anaconda\Lib\concurrent\futures\_base.py:428(result)
54 | 1220 1.293 0.001 215.684 0.177 C:\Users\34456\AppData\Roaming\Python\Python311\site-packages\mgtwr\model.py:450(cal_aic)
55 |
--------------------------------------------------------------------------------
/data/gwr.csv:
--------------------------------------------------------------------------------
1 | longitude,latitude,x1,x2,x3,y 2 | 941396.6,3521764.0,75.6,19.9,20.76,8.2 3 | 895553.0,3471916.0,100.0,26.0,26.86,6.4 4 | 930946.4,3502787.0,61.7,24.1,15.42,6.6 5 | 745398.6,3474765.0,100.0,24.8,51.67,9.4 6 | 849431.3,3665553.0,42.7,17.5,42.39,13.3 7 | 819317.3,3807616.0,100.0,15.1,3.49,6.4 8 | 803747.1,3769623.0,64.6,14.7,11.44,9.2 9 | 699011.5,3793408.0,75.2,10.7,9.21,9.0 10 | 863020.8,3520432.0,47.0,22.0,31.33,7.6 11 | 859915.8,3466377.0,66.2,19.3,11.62,7.5 12 | 809736.9,3636468.0,16.1,19.2,41.68,17.0 13 | 844270.1,3595691.0,57.9,18.3,22.36,10.3 14 | 979288.9,3463849.0,100.0,18.2,4.58,5.8 15 | 827822.0,3421638.0,65.6,25.9,41.47,9.1 16 | 1023145.0,3554982.0,80.6,13.2,14.85,11.8 17 | 994903.4,3600493.0,63.2,27.5,25.95,19.9 18 | 971593.8,3671394.0,72.3,30.3,52.19,9.6 19 | 782448.2,3684504.0,73.4,15.6,35.48,7.2 20 | 724741.2,3492653.0,100.0,31.8,58.89,10.1 21 | 1008480.0,3437933.0,47.1,11.5,20.19,13.5 22 | 964264.9,3598842.0,52.1,24.1,30.94,9.9 23 | 678778.6,3713250.0,68.5,14.4,15.46,12.0 24 | 670055.9,3862318.0,43.6,12.0,0.91,8.1 25 | 962612.3,3432769.0,100.0,18.3,27.05,6.4 26 | 1059706.0,3556747.0,5.1,17.2,38.02,18.6 27 | 
704959.2,3577608.0,13.7,10.4,30.94,20.2 28 | 653026.6,3813760.0,77.4,14.6,8.61,5.9 29 | 734240.9,3794110.0,57.8,6.1,1.77,18.4 30 | 832508.6,3762905.0,17.6,27.0,26.23,37.5 31 | 695793.9,3495219.0,100.0,35.7,60.76,11.2 32 | 745538.8,3711726.0,4.4,8.6,23.82,14.7 33 | 908046.1,3428340.0,58.6,26.4,27.29,6.7 34 | 724646.8,3757187.0,5.8,5.6,9.84,33.0 35 | 894463.9,3492465.0,64.6,22.5,25.46,11.1 36 | 808691.8,3455994.0,59.4,22.8,24.16,10.0 37 | 942527.9,3722100.0,30.6,6.6,10.93,23.9 38 | 839816.1,3449007.0,62.0,22.4,29.94,6.5 39 | 705457.9,3694344.0,76.1,11.4,22.59,13.3 40 | 783416.5,3623343.0,100.0,14.0,30.66,5.7 41 | 805648.4,3537103.0,48.4,29.0,40.66,10.0 42 | 635964.3,3854592.0,96.5,14.6,0.35,8.0 43 | 764386.1,3812502.0,100.0,12.8,0.29,8.6 44 | 732628.4,3421800.0,58.0,23.3,39.47,11.7 45 | 759231.9,3735253.0,2.5,9.9,42.23,32.7 46 | 860451.4,3569933.0,70.7,21.8,27.64,8.0 47 | 800031.3,3564188.0,72.6,32.9,48.98,9.5 48 | 764116.9,3494367.0,10.0,24.4,50.15,17.0 49 | 707288.7,3731361.0,26.7,6.6,7.63,12.0 50 | 703495.1,3467152.0,52.8,31.4,44.09,9.4 51 | 896654.0,3401148.0,100.0,14.6,11.48,4.7 52 | 1031899.0,3596117.0,89.1,12.7,14.03,7.6 53 | 879541.2,3785425.0,70.0,19.7,29.99,8.0 54 | 943066.2,3616602.0,64.2,25.7,32.58,9.1 55 | 981727.8,3571315.0,100.0,25.4,33.88,8.6 56 | 739255.8,3866604.0,100.0,17.2,0.03,7.8 57 | 731468.7,3700612.0,53.9,2.6,5.13,25.8 58 | 662257.4,3789664.0,36.1,13.6,13.56,13.7 59 | 765397.3,3789005.0,93.7,6.8,0.0,15.6 60 | 845701.3,3813323.0,87.2,16.5,9.89,9.5 61 | 733728.4,3733248.0,4.2,18.4,49.92,31.6 62 | 732702.3,3844809.0,100.0,16.6,0.26,8.6 63 | 908386.8,3685752.0,100.0,16.8,12.69,5.3 64 | 1023411.0,3471063.0,20.3,14.3,25.57,19.9 65 | 695325.1,3822135.0,79.7,11.1,3.78,9.2 66 | 765058.1,3421817.0,55.4,22.3,31.5,7.7 67 | 855577.3,3722330.0,75.7,25.1,49.89,8.8 68 | 772634.6,3764306.0,13.6,4.0,5.11,29.6 69 | 818917.1,3839931.0,88.5,11.6,5.42,12.0 70 | 794419.5,3803344.0,81.1,10.6,8.48,15.4 71 | 873518.8,3689861.0,100.0,30.1,79.64,6.8 72 | 
665933.8,3740622.0,67.8,14.4,6.47,7.5 73 | 695500.6,3624790.0,95.8,13.7,25.49,13.6 74 | 870749.9,3810303.0,73.8,14.2,20.41,9.1 75 | 675280.4,3685569.0,100.0,19.1,13.38,5.7 76 | 763488.4,3699716.0,76.0,6.1,10.24,10.7 77 | 814118.9,3590553.0,20.9,10.6,21.8,16.0 78 | 855461.8,3506293.0,63.4,27.2,30.5,8.3 79 | 815753.1,3783949.0,78.0,14.1,9.58,9.0 80 | 807249.1,3695092.0,100.0,17.4,34.8,10.8 81 | 915741.9,3530869.0,65.1,18.8,15.36,8.3 82 | 924108.1,3668080.0,100.0,31.3,55.92,6.2 83 | 970465.7,3640263.0,53.8,27.8,41.51,7.7 84 | 908636.7,3624562.0,100.0,22.2,33.89,4.9 85 | 821367.1,3660143.0,81.9,10.8,25.6,12.0 86 | 766461.7,3663959.0,63.6,16.3,34.03,10.0 87 | 873804.3,3439981.0,100.0,25.9,26.58,5.4 88 | 884830.4,3599291.0,52.9,20.5,33.32,12.0 89 | 770455.5,3520161.0,78.2,12.6,19.22,13.7 90 | 1014742.0,3537225.0,32.9,17.2,39.15,13.4 91 | 919396.5,3752562.0,100.0,17.8,38.19,8.2 92 | 1004544.0,3517834.0,100.0,23.7,21.75,5.2 93 | 864781.1,3419313.0,47.6,19.9,31.88,16.3 94 | 772600.0,3832429.0,78.6,15.3,1.41,11.1 95 | 917730.9,3716368.0,65.9,21.6,36.38,10.4 96 | 1030500.0,3500535.0,100.0,22.3,43.34,8.7 97 | 777055.3,3584821.0,65.6,29.2,58.72,10.1 98 | 848638.8,3785405.0,100.0,15.7,8.32,9.7 99 | 732876.8,3584393.0,100.0,28.2,41.32,4.6 100 | 715359.8,3660275.0,82.3,22.4,44.62,6.7 101 | 716369.8,3451034.0,100.0,22.1,27.48,8.2 102 | 766238.6,3453930.0,56.2,28.7,47.91,7.8 103 | 790338.7,3660608.0,75.1,13.8,31.78,12.9 104 | 920887.4,3568473.0,98.6,24.5,28.27,10.1 105 | 825920.1,3717990.0,73.0,15.0,34.74,11.0 106 | 707834.3,3854188.0,89.0,11.3,0.26,5.5 107 | 700833.7,3598228.0,3.2,18.6,37.95,16.6 108 | 793263.9,3719734.0,76.0,14.4,22.35,9.5 109 | 830735.9,3750903.0,95.2,7.9,7.37,28.4 110 | 863291.8,3756777.0,100.0,16.2,24.74,12.8 111 | 695329.2,3758093.0,93.7,8.8,3.94,7.6 112 | 798061.4,3609091.0,61.3,24.0,47.53,15.2 113 | 733846.7,3812828.0,100.0,12.8,1.48,9.0 114 | 953533.8,3482044.0,74.4,21.3,11.69,6.3 115 | 744180.8,3665561.0,100.0,13.4,20.04,9.3 116 | 
668031.4,3764766.0,66.5,16.3,14.3,6.8 117 | 833819.6,3567447.0,56.5,24.3,32.46,10.7 118 | 840169.1,3695254.0,66.5,16.4,32.79,11.7 119 | 686875.4,3524124.0,100.0,33.0,49.93,7.3 120 | 824645.5,3864805.0,100.0,13.6,0.35,11.6 121 | 712437.1,3519627.0,53.5,35.9,58.17,6.0 122 | 954272.3,3697862.0,9.9,18.2,41.96,17.3 123 | 777759.0,3729605.0,59.2,6.2,8.03,18.1 124 | 752973.1,3570222.0,100.0,19.9,34.09,8.0 125 | 1004028.0,3641918.0,79.3,22.9,44.69,8.6 126 | 704495.6,3422002.0,69.4,29.1,32.74,7.8 127 | 754916.2,3685029.0,53.6,15.6,29.08,11.1 128 | 842085.9,3827075.0,64.5,17.0,11.81,13.1 129 | 703256.8,3552857.0,100.0,31.4,63.46,8.0 130 | 763457.1,3551752.0,45.4,24.8,46.53,15.9 131 | 734217.9,3623162.0,97.9,24.9,62.34,7.1 132 | 884376.9,3717493.0,100.0,31.9,61.36,5.6 133 | 963427.8,3560039.0,79.3,21.9,29.19,6.5 134 | 759410.8,3608179.0,100.0,29.5,43.21,7.1 135 | 882069.4,3534470.0,72.6,27.3,34.45,8.6 136 | 743031.8,3522636.0,50.3,29.1,59.9,9.2 137 | 795506.2,3421725.0,55.2,22.6,37.93,13.4 138 | 831682.3,3487715.0,51.1,22.9,26.68,14.0 139 | 941734.4,3567586.0,35.7,24.0,23.38,11.4 140 | 797981.7,3872640.0,100.0,14.0,0.0,11.4 141 | 919077.6,3595170.0,53.3,27.1,33.1,6.3 142 | 682616.8,3660254.0,44.0,16.3,30.03,13.6 143 | 819399.6,3514927.0,44.5,31.3,40.66,7.2 144 | 832935.0,3623868.0,100.0,26.0,45.93,4.8 145 | 777040.1,3858779.0,100.0,18.3,0.1,10.1 146 | 752165.2,3639192.0,65.3,14.7,27.78,9.0 147 | 658870.4,3842167.0,44.8,12.8,3.73,8.4 148 | 800384.3,3742691.0,61.2,13.2,18.37,9.4 149 | 938349.6,3446675.0,54.2,21.1,25.88,10.4 150 | 902471.1,3699878.0,100.0,32.6,60.23,4.2 151 | 894704.3,3648583.0,67.1,21.6,51.86,9.8 152 | 986832.8,3494323.0,59.9,21.2,19.45,9.6 153 | 731576.3,3544716.0,100.0,22.5,50.2,5.5 154 | 898776.3,3563384.0,100.0,30.3,30.06,8.6 155 | 796905.6,3841086.0,100.0,12.5,2.59,13.6 156 | 686891.4,3855274.0,70.0,11.1,4.06,12.0 157 | 838551.5,3538547.0,100.0,28.6,31.76,7.6 158 | 891228.5,3749769.0,59.6,22.6,45.94,10.4 159 | 858796.9,3637891.0,100.0,15.3,41.99,8.8 160 | 
801018.1,3487328.0,71.1,26.2,30.71,6.3 161 | -------------------------------------------------------------------------------- /jupyter notebook/example.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "2d479b4a", 6 | "metadata": { 7 | "ExecuteTime": { 8 | "end_time": "2022-09-19T07:47:51.787237Z", 9 | "start_time": "2022-09-19T07:47:51.778516Z" 10 | } 11 | }, 12 | "source": [ 13 | "Read data" 14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 1, 19 | "id": "4b06eeb4", 20 | "metadata": { 21 | "ExecuteTime": { 22 | "end_time": "2022-09-19T08:09:42.796617Z", 23 | "start_time": "2022-09-19T08:09:42.434654Z" 24 | } 25 | }, 26 | "outputs": [], 27 | "source": [ 28 | "import pandas as pd" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": 2, 34 | "id": "7d65c606", 35 | "metadata": { 36 | "ExecuteTime": { 37 | "end_time": "2022-09-19T08:09:43.559791Z", 38 | "start_time": "2022-09-19T08:09:43.539778Z" 39 | } 40 | }, 41 | "outputs": [], 42 | "source": [ 43 | "data = pd.read_csv(r'\\data\\example.csv')" 44 | ] 45 | }, 46 | { 47 | "cell_type": "code", 48 | "execution_count": 3, 49 | "id": "796fe3e5", 50 | "metadata": { 51 | "ExecuteTime": { 52 | "end_time": "2022-09-19T08:09:44.798946Z", 53 | "start_time": "2022-09-19T08:09:44.787275Z" 54 | } 55 | }, 56 | "outputs": [], 57 | "source": [ 58 | "coords = data[['longitude', 'latitude']]\n", 59 | "t = data[['t']]\n", 60 | "X = data[['x1', 'x2']]\n", 61 | "y = data[['y']]" 62 | ] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "id": "905c6002", 67 | "metadata": {}, 68 | "source": [ 69 | "GWR model" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 4, 75 | "id": "532ffaf3", 76 | "metadata": { 77 | "ExecuteTime": { 78 | "end_time": "2022-09-19T08:09:50.207629Z", 79 | "start_time": "2022-09-19T08:09:50.131373Z" 80 | } 81 | }, 82 | "outputs": [], 83 | "source": [ 84 | 
"from mgtwr.sel import SearchGWRParameter\n", 85 | "from mgtwr.model import GWR" 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": 5, 91 | "id": "e0aa3e40", 92 | "metadata": { 93 | "ExecuteTime": { 94 | "end_time": "2022-09-19T08:09:54.705874Z", 95 | "start_time": "2022-09-19T08:09:53.355340Z" 96 | } 97 | }, 98 | "outputs": [ 99 | { 100 | "name": "stdout", 101 | "output_type": "stream", 102 | "text": [ 103 | "bw: 15.0 , score: 18778.49\n", 104 | "bw: 10.0 , score: 18764.75\n", 105 | "bw: 6.0 , score: 18699.21\n", 106 | "bw: 4.0 , score: 18506.22\n", 107 | "bw: 2.0 , score: 17786.86\n", 108 | "bw: 2.0 , score: 17786.86\n", 109 | "time cost: 0:00:1.934\n" 110 | ] 111 | } 112 | ], 113 | "source": [ 114 | "sel = SearchGWRParameter(coords, X, y, kernel='gaussian', fixed=True)\n", 115 | "bw = sel.search(bw_max=40, verbose=True, time_cost=True)" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": 6, 121 | "id": "cb3be837", 122 | "metadata": { 123 | "ExecuteTime": { 124 | "end_time": "2022-09-19T08:10:32.986328Z", 125 | "start_time": "2022-09-19T08:10:32.709532Z" 126 | } 127 | }, 128 | "outputs": [ 129 | { 130 | "name": "stdout", 131 | "output_type": "stream", 132 | "text": [ 133 | "0.5935790327518\n" 134 | ] 135 | } 136 | ], 137 | "source": [ 138 | "gwr = GWR(coords, X, y, bw, kernel='gaussian', fixed=True).fit()\n", 139 | "print(gwr.R2)" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "id": "ac6e9d39", 145 | "metadata": {}, 146 | "source": [ 147 | "MGWR model" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 7, 153 | "id": "20f580b3", 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [ 157 | "from mgtwr.sel import SearchMGWRParameter\n", 158 | "from mgtwr.model import MGWR" 159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": 8, 164 | "id": "08bf65d5", 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "name": "stdout", 169 | "output_type": 
"stream", 170 | "text": [ 171 | "Current iteration: 1 ,SOC: 0.0033171\n", 172 | "Bandwidths: 986.8, 965.5, 0.7\n", 173 | "Current iteration: 2 ,SOC: 5.64e-05\n", 174 | "Bandwidths: 986.8, 986.8, 0.7\n", 175 | "Current iteration: 3 ,SOC: 4.27e-05\n", 176 | "Bandwidths: 986.8, 986.8, 0.7\n", 177 | "Current iteration: 4 ,SOC: 3.22e-05\n", 178 | "Bandwidths: 986.8, 986.8, 0.7\n", 179 | "Current iteration: 5 ,SOC: 2.43e-05\n", 180 | "Bandwidths: 986.8, 986.8, 0.7\n", 181 | "time cost: 0:00:35.14\n" 182 | ] 183 | } 184 | ], 185 | "source": [ 186 | "sel_multi = SearchMGWRParameter(coords, X, y, kernel='gaussian', fixed=True)\n", 187 | "bws = sel_multi.search(multi_bw_max=[1000], verbose=True, time_cost=True, tol_multi=3.0e-5)" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 9, 193 | "id": "e7dbf9fb", 194 | "metadata": {}, 195 | "outputs": [ 196 | { 197 | "name": "stdout", 198 | "output_type": "stream", 199 | "text": [ 200 | "0.7045779853867871\n" 201 | ] 202 | } 203 | ], 204 | "source": [ 205 | "mgwr = MGWR(coords, X, y, sel_multi, kernel='gaussian', fixed=True).fit()\n", 206 | "print(mgwr.R2)" 207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "id": "68c915f1", 212 | "metadata": {}, 213 | "source": [ 214 | "If you already know bws, you can also do the following" 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": 10, 220 | "id": "56555609", 221 | "metadata": {}, 222 | "outputs": [ 223 | { 224 | "name": "stdout", 225 | "output_type": "stream", 226 | "text": [ 227 | "0.7045779853867871\n" 228 | ] 229 | } 230 | ], 231 | "source": [ 232 | "class sel_multi:\n", 233 | " def __init__(self, bws):\n", 234 | " self.bws = bws\n", 235 | "\n", 236 | " \n", 237 | "selector = sel_multi(bws)\n", 238 | "mgwr = MGWR(coords, X, y, selector, kernel='gaussian', fixed=True).fit()\n", 239 | "print(mgwr.R2)" 240 | ] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "id": "6aea4108", 245 | "metadata": { 246 | "ExecuteTime": { 
247 | "end_time": "2022-09-19T08:11:21.337967Z", 248 | "start_time": "2022-09-19T08:11:21.326547Z" 249 | } 250 | }, 251 | "source": [ 252 | "GTWR model" 253 | ] 254 | }, 255 | { 256 | "cell_type": "code", 257 | "execution_count": 11, 258 | "id": "462da66a", 259 | "metadata": { 260 | "ExecuteTime": { 261 | "end_time": "2022-09-19T08:11:53.026336Z", 262 | "start_time": "2022-09-19T08:11:53.021405Z" 263 | } 264 | }, 265 | "outputs": [], 266 | "source": [ 267 | "from mgtwr.sel import SearchGTWRParameter\n", 268 | "from mgtwr.model import GTWR" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": 12, 274 | "id": "4f9cc821", 275 | "metadata": { 276 | "ExecuteTime": { 277 | "end_time": "2022-09-19T08:14:07.489058Z", 278 | "start_time": "2022-09-19T08:13:28.866324Z" 279 | } 280 | }, 281 | "outputs": [ 282 | { 283 | "name": "stdout", 284 | "output_type": "stream", 285 | "text": [ 286 | "bw: 5.9 , tau: 19.9 , score: 18095.04059255282\n", 287 | "bw: 3.7 , tau: 19.9 , score: 17608.38596885707\n", 288 | "bw: 2.3 , tau: 10.1 , score: 16461.58709937909\n", 289 | "bw: 1.4 , tau: 3.8 , score: 14817.811620052908\n", 290 | "bw: 0.9 , tau: 1.4 , score: 13780.792562049754\n", 291 | "bw: 0.9 , tau: 1.4 , score: 13780.792562049754\n", 292 | "bw: 0.9 , tau: 1.4 , score: 13780.792562049754\n", 293 | "bw: 0.9 , tau: 1.4 , score: 13780.792562049754\n", 294 | "bw: 0.9 , tau: 1.4 , score: 13780.792562049754\n", 295 | "time cost: 0:00:40.776\n" 296 | ] 297 | } 298 | ], 299 | "source": [ 300 | "sel = SearchGTWRParameter(coords, t, X, y, kernel='gaussian', fixed=True)\n", 301 | "bw, tau = sel.search(tau_max=20, verbose=True, time_cost=True)" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": 13, 307 | "id": "4bbf93f8", 308 | "metadata": { 309 | "ExecuteTime": { 310 | "end_time": "2022-09-19T08:14:17.776587Z", 311 | "start_time": "2022-09-19T08:14:17.313360Z" 312 | } 313 | }, 314 | "outputs": [ 315 | { 316 | "name": "stdout", 317 | "output_type": 
"stream", 318 | "text": [ 319 | "0.9829884630503501\n" 320 | ] 321 | } 322 | ], 323 | "source": [ 324 | "gtwr = GTWR(coords, t, X, y, bw, tau, kernel='gaussian', fixed=True).fit()\n", 325 | "print(gtwr.R2)" 326 | ] 327 | }, 328 | { 329 | "cell_type": "markdown", 330 | "id": "2ad9399f", 331 | "metadata": {}, 332 | "source": [ 333 | "MGTWR model" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": 14, 339 | "id": "7d015f1a", 340 | "metadata": { 341 | "ExecuteTime": { 342 | "end_time": "2022-09-19T08:15:02.313810Z", 343 | "start_time": "2022-09-19T08:15:02.303789Z" 344 | } 345 | }, 346 | "outputs": [], 347 | "source": [ 348 | "from mgtwr.sel import SearchMGTWRParameter\n", 349 | "from mgtwr.model import MGTWR" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": 15, 355 | "id": "94d738b5", 356 | "metadata": { 357 | "ExecuteTime": { 358 | "end_time": "2022-09-19T08:23:08.330524Z", 359 | "start_time": "2022-09-19T08:15:42.813827Z" 360 | } 361 | }, 362 | "outputs": [ 363 | { 364 | "name": "stdout", 365 | "output_type": "stream", 366 | "text": [ 367 | "Current iteration: 1 ,SOC: 0.0025274\n", 368 | "Bandwidths: 0.7, 0.7, 0.5\n", 369 | "taus: 1.3,0.8,0.8\n", 370 | "Current iteration: 2 ,SOC: 0.0011033\n", 371 | "Bandwidths: 0.9, 0.7, 0.5\n", 372 | "taus: 3.0,0.4,0.8\n", 373 | "Current iteration: 3 ,SOC: 0.0005365\n", 374 | "Bandwidths: 0.9, 0.7, 0.5\n", 375 | "taus: 3.4,0.2,0.8\n", 376 | "Current iteration: 4 ,SOC: 0.0003\n", 377 | "Bandwidths: 0.9, 0.7, 0.5\n", 378 | "taus: 3.4,0.2,0.8\n", 379 | "Current iteration: 5 ,SOC: 0.0001986\n", 380 | "Bandwidths: 0.9, 0.7, 0.5\n", 381 | "taus: 3.6,0.2,0.8\n", 382 | "Current iteration: 6 ,SOC: 0.0001415\n", 383 | "Bandwidths: 0.9, 0.7, 0.5\n", 384 | "taus: 3.6,0.2,0.8\n", 385 | "Current iteration: 7 ,SOC: 0.0001052\n", 386 | "Bandwidths: 0.9, 0.7, 0.5\n", 387 | "taus: 3.6,0.2,0.8\n", 388 | "Current iteration: 8 ,SOC: 7.99e-05\n", 389 | "Bandwidths: 0.9, 0.7, 0.5\n", 390 | "taus: 
3.6,0.2,0.8\n", 391 | "time cost: 0:06:2.651\n" 392 | ] 393 | } 394 | ], 395 | "source": [ 396 | "sel_multi = SearchMGTWRParameter(coords, t, X, y, kernel='gaussian', fixed=True)\n", 397 | "bws = sel_multi.search(multi_bw_min=[0.1], verbose=True, tol_multi=1.0e-4, time_cost=True)" 398 | ] 399 | }, 400 | { 401 | "cell_type": "code", 402 | "execution_count": 16, 403 | "id": "51401611", 404 | "metadata": { 405 | "ExecuteTime": { 406 | "end_time": "2022-09-19T08:24:31.131209Z", 407 | "start_time": "2022-09-19T08:24:16.718379Z" 408 | } 409 | }, 410 | "outputs": [ 411 | { 412 | "name": "stdout", 413 | "output_type": "stream", 414 | "text": [ 415 | "0.9972924820674222\n" 416 | ] 417 | } 418 | ], 419 | "source": [ 420 | "mgtwr = MGTWR(coords, t, X, y, sel_multi, kernel='gaussian', fixed=True).fit()\n", 421 | "print(mgtwr.R2)" 422 | ] 423 | }, 424 | { 425 | "cell_type": "markdown", 426 | "id": "541bdcce", 427 | "metadata": {}, 428 | "source": [ 429 | "If you already know bws, you can also do the following" 430 | ] 431 | }, 432 | { 433 | "cell_type": "code", 434 | "execution_count": 17, 435 | "id": "bcfc1992", 436 | "metadata": { 437 | "ExecuteTime": { 438 | "end_time": "2022-09-19T08:25:21.934146Z", 439 | "start_time": "2022-09-19T08:25:08.333204Z" 440 | } 441 | }, 442 | "outputs": [ 443 | { 444 | "name": "stdout", 445 | "output_type": "stream", 446 | "text": [ 447 | "0.9972924820674222\n" 448 | ] 449 | } 450 | ], 451 | "source": [ 452 | "class sel_multi:\n", 453 | " def __init__(self, bws):\n", 454 | " self.bws = bws\n", 455 | "\n", 456 | " \n", 457 | "selector = sel_multi(bws)\n", 458 | "mgtwr = MGTWR(coords, t, X, y, selector, kernel='gaussian', fixed=True).fit()\n", 459 | "print(mgtwr.R2)" 460 | ] 461 | }, 462 | { 463 | "cell_type": "code", 464 | "execution_count": null, 465 | "id": "a0534878", 466 | "metadata": {}, 467 | "outputs": [], 468 | "source": [] 469 | } 470 | ], 471 | "metadata": { 472 | "kernelspec": { 473 | "display_name": "Python 3 (ipykernel)", 474 | 
"language": "python", 475 | "name": "python3" 476 | }, 477 | "language_info": { 478 | "codemirror_mode": { 479 | "name": "ipython", 480 | "version": 3 481 | }, 482 | "file_extension": ".py", 483 | "mimetype": "text/x-python", 484 | "name": "python", 485 | "nbconvert_exporter": "python", 486 | "pygments_lexer": "ipython3", 487 | "version": "3.9.12" 488 | } 489 | }, 490 | "nbformat": 4, 491 | "nbformat_minor": 5 492 | } 493 | -------------------------------------------------------------------------------- /mgtwr/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = '2.0.5' 2 | -------------------------------------------------------------------------------- /mgtwr/__pycache__/__init__.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/__init__.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/diagnosis.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/diagnosis.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/function.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/function.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/function_.cpython-39.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/function_.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/kernel.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/kernel.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/kernelt.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/kernelt.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/model.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/model.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/modelt.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/modelt.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/obj.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/obj.cpython-39.pyc -------------------------------------------------------------------------------- /mgtwr/__pycache__/objt.cpython-39.pyc: 
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/objt.cpython-39.pyc
--------------------------------------------------------------------------------
/mgtwr/__pycache__/sel.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/sel.cpython-39.pyc
--------------------------------------------------------------------------------
/mgtwr/__pycache__/selt.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/selt.cpython-39.pyc
--------------------------------------------------------------------------------
/mgtwr/__pycache__/setup.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sunkun1997/mgtwr/5680ee85d6f010a9f34c3d611c51d95a9938c4db/mgtwr/__pycache__/setup.cpython-39.pyc
--------------------------------------------------------------------------------
/mgtwr/diagnosis.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 |
4 | def get_AICc(gtwr):
5 |     """
6 |     Get AICc value
7 |
8 |     Gaussian: p61, (2.33), Fotheringham, Brunsdon and Charlton (2002)
9 |
10 |     GWGLM: AICc=AIC+2k(k+1)/(n-k-1), Nakaya et al. 
(2005): p2704, (36)
11 |
12 |     """
13 |     n = gtwr.n
14 |     k = gtwr.tr_S
15 |
16 |     aicc = get_AIC(gtwr) + 2.0 * k * (k + 1.0) / (n - k - 1.0)
17 |     return aicc
18 |
19 |
20 | def get_AIC(gtwr):
21 |     """
22 |     Get AIC value
23 |
24 |     Gaussian: p96, (4.22), Fotheringham, Brunsdon and Charlton (2002)
25 |
26 |     GWGLM: AIC(G)=D(G) + 2K(G), where D and K denote the deviance and the effective
27 |     number of parameters in the model with bandwidth G, respectively.
28 |
29 |     """
30 |
31 |     k = gtwr.tr_S
32 |
33 |     aic = -2.0 * gtwr.llf + 2.0 * (k + 1)
34 |
35 |     return aic
36 |
37 |
38 | def get_BIC(gtwr):
39 |     """
40 |     Get BIC value
41 |
42 |     Gaussian: p61 (2.34), Fotheringham, Brunsdon and Charlton (2002)
43 |     BIC = -2log(L)+k*log(n)
44 |
45 |     GWGLM: BIC = dev + tr_S * log(n)
46 |
47 |     """
48 |     n = gtwr.n  # (scalar) number of observations
49 |     k = gtwr.tr_S
50 |
51 |     bic = -2.0 * gtwr.llf + (k + 1) * np.log(n)
52 |     return bic
53 |
54 |
55 | def get_CV(gtwr):
56 |     """
57 |     Get CV value
58 |
59 |     Gaussian only
60 |
61 |     Methods: p60, (2.31) or p212 (9.4)
62 |     Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002).
63 |     Geographically weighted regression: the analysis of spatially varying relationships.
64 |     Modification: the sum of squared residuals is divided by n, following GWR4 results
65 |
66 |     """
67 |
68 |     cv = gtwr.aa / gtwr.n
69 |     return cv
70 |
71 |
72 | def corr(cov):
73 |     # Convert a covariance matrix into the corresponding correlation matrix.
74 |     invsd = np.diag(1 / np.sqrt(np.diag(cov)))
75 |     cors = np.dot(np.dot(invsd, cov), invsd)
76 |     return cors
77 |
--------------------------------------------------------------------------------
/mgtwr/function.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from scipy import linalg
3 | import time
4 | from typing import Callable
5 | from copy import deepcopy
6 |
7 |
8 | def print_time(func: Callable):
9 |     def inner(*args, **kwargs):
10 |         start = time.time()
11 |         res = func(*args, **kwargs)
12 |         end = time.time()
13 |         m, s = divmod(end - start, 60)
14 |         h, m = divmod(m, 60)
15 |         if 'time_cost' in kwargs and kwargs['time_cost']:
16 |             print("time cost: %d:%02d:%s" % (h, m, round(s, 3)))
17 |         return res
18 |     return inner
19 |
20 |
21 | def _compute_betas_gwr(y, x, wi):
22 |     """
23 |     compute MLE coefficients using iwls routine
24 |
25 |     Methods: p189, Iteratively (Re)weighted Least Squares (IWLS),
26 |     Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2002).
27 |     Geographically weighted regression: the analysis of spatially varying relationships.
28 |     """
29 |     xt = (x * wi).T
30 |     xtx = np.dot(xt, x)
31 |     xtx_inv_xt = linalg.solve(xtx, xt)
32 |     betas = np.dot(xtx_inv_xt, y)
33 |     return betas, xtx_inv_xt
34 |
35 |
36 | def surface_to_plane(
37 |         longitude: np.ndarray,
38 |         latitude: np.ndarray,
39 |         central_longitude: int = 114
40 | ):
41 |
42 |     r"""
43 |     based on the Gauss-Kruger projection
44 |
45 |     equatorial radius: a = 6378136.49m
46 |
47 |     polar radius: b = 6356755m
48 |
49 |     so that
50 |
51 |     first eccentricity :math:`e = \sqrt{a^2-b^2}/a`
52 |
53 |     second eccentricity :math:`e' = \sqrt{a^2-b^2}/b`
54 |
55 |     so that
56 |
57 |     .. 
math::
58 |         \begin{aligned}
59 |         Y_{b0}=a^2B\beta_0/b +
60 |         sin(B)\left(\beta_2cos(B)+\beta_4cos^3(B)+\beta_6cos^5(B)+\beta_8cos^7(B)\right)
61 |         \end{aligned}
62 |     where B is the latitude converted from degrees to radians and
63 |
64 |     .. math::
65 |         \begin{aligned}
66 |         \beta_0 &= 1-\frac{3}{4}e'^2+\frac{45}{64}e'^4-\frac{175}{256}e'^6+
67 |         \frac{11025}{16384}e'^8 \\
68 |         \beta_2 &= \beta_0 - 1 \\
69 |         \beta_4 &= \frac{15}{32}e'^4-\frac{175}{384}e'^6+\frac{3675}{8192}e'^8 \\
70 |         \beta_6 &= -\frac{35}{96}e'^6 + \frac{735}{2048}e'^8 \\
71 |         \beta_8 &= \frac{315}{1024}e'^8 \\
72 |         \end{aligned}
73 |
74 |     so that the Y-axis is
75 |
76 |     .. math::
77 |         \begin{aligned}
78 |         Y &= Y_{b0}+\frac{1}{2}Ntan(B)m^2+\frac{1}{24}\left(5-tan^2(B)+9\eta^2+4\eta^4
79 |         \right)Ntan(B)m^4 \\
80 |         &+ \frac{1}{720}\left(61-58tan^2(B)\right)Ntan(B)m^6
81 |         \end{aligned}
82 |     where L is the longitude minus the central longitude, converted to radians, and
83 |
84 |     .. math::
85 |         \begin{aligned}
86 |         N &= a/\sqrt{1-(esin(B))^2} \\
87 |         \eta &= e'cos(B) \\
88 |         m &= Lcos(B) \\
89 |         \end{aligned}
90 |     so that the X-axis is
91 |
92 |     .. 
math:: 93 | \begin{aligned} 94 | X &= Nm+\frac{1}{6}\left(1-tan^2(B)+\eta^2\right)Nm^3 \\ 95 | &+ \frac{1}{120}\left(5-18tan^2(B)+tan^4(B)+14\eta^2-58tan^2(B)\eta\right)Nm^5+500000 96 | \end{aligned} 97 | """ 98 | a = 6378136.49 99 | b = 6356755 100 | 101 | e1 = np.sqrt(a ** 2 - b ** 2) / a 102 | e2 = np.sqrt(a ** 2 - b ** 2) / b 103 | beta0 = 1 - (3 / 4) * e2 ** 2 + (45 / 64) * e2 ** 4 - (175 / 256) * e2 ** 6 \ 104 | + (11025 / 16384) * e2 ** 8 105 | beta2 = beta0 - 1 106 | beta4 = (15 / 32) * e2 ** 4 - (175 / 384) * e2 ** 6 + (3675 / 8192) * e2 ** 8 107 | beta6 = -(35 / 96) * e2 ** 6 + (735 / 2048) * e2 ** 8 108 | beta8 = (315 / 1024) * e2 ** 8 109 | 110 | L = np.radians(longitude - central_longitude) 111 | B = np.radians(latitude) 112 | cosB = np.cos(B) 113 | sinB = np.sin(B) 114 | tanB = np.tan(B) 115 | N = a / np.sqrt(1 - (e1 * sinB) ** 2) 116 | eta = e2 * cosB 117 | m = L * cosB 118 | Yb0 = a ** 2 * B * beta0 / b + sinB * \ 119 | (beta2 * cosB + beta4 * cosB ** 3 + beta6 * cosB ** 5 + beta8 * cosB ** 7) 120 | Y = Yb0 + (1 / 2) * N * tanB * m ** 2 + (1 / 24) * (5 - tanB ** 2 + 9 * eta ** 2 + 4 * eta ** 4) * N * tanB * \ 121 | m ** 4 + (1 / 720) * (61 - 58 * tanB ** 2) * N * tanB * m ** 6 122 | X = N * m + (1 / 6) * (1 - tanB ** 2 + eta ** 2) * N * m ** 3 + \ 123 | (1 / 120) * (5 - 18 * tanB ** 2 + tanB ** 4 + 14 * eta ** 2 - 58 * tanB ** 2 * eta) * N * m ** 5 + 500000 124 | X = X.reshape(-1, 1) 125 | Y = Y.reshape(-1, 1) 126 | return X, Y 127 | 128 | 129 | def golden_section(a, c, delta, decimal, function, tol, max_iter, verbose=False): 130 | b = a + delta * np.abs(c - a) 131 | d = c - delta * np.abs(c - a) 132 | diff = 1.0e9 133 | iter_num = 0 134 | score_dict = {} 135 | opt_val = None 136 | while np.abs(diff) > tol and iter_num < max_iter: 137 | iter_num += 1 138 | b = np.round(b, decimal) 139 | d = np.round(d, decimal) 140 | 141 | if b in score_dict: 142 | score_b = score_dict[b] 143 | else: 144 | score_b = function(b) 145 | score_dict[b] = score_b 146 | 
147 | if d in score_dict: 148 | score_d = score_dict[d] 149 | else: 150 | score_d = function(d) 151 | score_dict[d] = score_d 152 | 153 | if score_b <= score_d: 154 | opt_val = b 155 | opt_score = score_b 156 | c = d 157 | d = b 158 | b = a + delta * np.abs(c - a) 159 | 160 | else: 161 | opt_val = d 162 | opt_score = score_d 163 | a = b 164 | b = d 165 | d = c - delta * np.abs(c - a) 166 | 167 | opt_val = np.round(opt_val, decimal) 168 | diff = score_b - score_d 169 | if verbose: 170 | print('bw:', opt_val, ', score:', np.round(opt_score, 2)) 171 | 172 | return opt_val 173 | 174 | 175 | def onestep_golden_section(A, C, x, delta, tau_decimal, function, tol): 176 | iter_num = 0 177 | score_dict = {} 178 | diff = 1e9 179 | opt_score = None 180 | opt_tau = None 181 | B = A + delta * np.abs(C - A) 182 | D = C - delta * np.abs(C - A) 183 | while np.abs(diff) > tol and iter_num < 200: 184 | iter_num += 1 185 | B = np.round(B, tau_decimal) 186 | D = np.round(D, tau_decimal) 187 | if B in score_dict: 188 | score_B = score_dict[B] 189 | else: 190 | score_B = function(x, B) 191 | score_dict[B] = score_B 192 | 193 | if D in score_dict: 194 | score_D = score_dict[D] 195 | else: 196 | score_D = function(x, D) 197 | score_dict[D] = score_D 198 | if score_B <= score_D: 199 | opt_score = score_B 200 | opt_tau = B 201 | C = D 202 | D = B 203 | B = A + delta * np.abs(C - A) 204 | else: 205 | opt_score = score_D 206 | opt_tau = D 207 | A = B 208 | B = D 209 | D = C - delta * np.abs(C - A) 210 | diff = score_B - score_D 211 | return opt_tau, opt_score 212 | 213 | 214 | def twostep_golden_section( 215 | a, c, A, C, delta, function, 216 | tol, max_iter, bw_decimal, tau_decimal, verbose=False): 217 | b = a + delta * np.abs(c - a) 218 | d = c - delta * np.abs(c - a) 219 | opt_bw = None 220 | opt_tau = None 221 | diff = 1e9 222 | score_dict = {} 223 | iter_num = 0 224 | while np.abs(diff) > tol and iter_num < max_iter: 225 | iter_num += 1 226 | b = np.round(b, bw_decimal) 227 | d = 
np.round(d, bw_decimal) 228 | if b in score_dict: 229 | tau_b, score_b = score_dict[b] 230 | else: 231 | tau_b, score_b = onestep_golden_section(A, C, b, delta, tau_decimal, function, tol) 232 | score_dict[b] = [tau_b, score_b] 233 | if d in score_dict: 234 | tau_d, score_d = score_dict[d] 235 | else: 236 | tau_d, score_d = onestep_golden_section(A, C, d, delta, tau_decimal, function, tol) 237 | score_dict[d] = [tau_d, score_d] 238 | 239 | if score_b <= score_d: 240 | opt_score = score_b 241 | opt_bw = b 242 | opt_tau = tau_b 243 | c = d 244 | d = b 245 | b = a + delta * np.abs(c - a) 246 | else: 247 | opt_score = score_d 248 | opt_bw = d 249 | opt_tau = tau_d 250 | a = b 251 | b = d 252 | d = c - delta * np.abs(c - a) 253 | diff = score_b - score_d 254 | if verbose: 255 | print('bw: ', opt_bw, ', tau: ', opt_tau, ', score: ', opt_score) 256 | return opt_bw, opt_tau 257 | 258 | 259 | def multi_bw(init, X, y, n, k, tol, rss_score, gwr_func, 260 | bw_func, sel_func, multi_bw_min, multi_bw_max, bws_same_times, 261 | verbose=False): 262 | """ 263 | Multiscale GWR bandwidth search procedure using iterative GAM backfitting 264 | """ 265 | if init is None: 266 | bw = sel_func(bw_func(X, y)) 267 | optim_model = gwr_func(X, y, bw) 268 | else: 269 | bw = init 270 | optim_model = gwr_func(X, y, init) 271 | bw_gwr = bw 272 | err = optim_model.reside 273 | betas = optim_model.betas 274 | XB = np.multiply(betas, X) 275 | rss = np.sum(err ** 2) if rss_score else None 276 | scores = [] 277 | BWs = [] 278 | bw_stable_counter = 0 279 | bws = np.empty(k) 280 | Betas = None 281 | 282 | for iters in range(1, 201): 283 | new_XB = np.zeros_like(X) 284 | Betas = np.zeros_like(X) 285 | 286 | for j in range(k): 287 | temp_y = XB[:, j].reshape((-1, 1)) 288 | temp_y = temp_y + err 289 | temp_X = X[:, j].reshape((-1, 1)) 290 | bw_class = bw_func(temp_X, temp_y) 291 | 292 | if bw_stable_counter >= bws_same_times: 293 | # If in backfitting, all bws not changing in bws_same_times (default 5) 
iterations 294 | bw = bws[j] 295 | else: 296 | bw = sel_func(bw_class, multi_bw_min[j], multi_bw_max[j]) 297 | 298 | optim_model = gwr_func(temp_X, temp_y, bw) 299 | err = optim_model.reside 300 | betas = optim_model.betas 301 | new_XB[:, j] = optim_model.pre.reshape(-1) 302 | Betas[:, j] = betas.reshape(-1) 303 | bws[j] = bw 304 | 305 | # If bws remain the same as from previous iteration 306 | if (iters > 1) and np.all(BWs[-1] == bws): 307 | bw_stable_counter += 1 308 | else: 309 | bw_stable_counter = 0 310 | 311 | num = np.sum((new_XB - XB) ** 2) / n 312 | den = np.sum(np.sum(new_XB, axis=1) ** 2) 313 | score = (num / den) ** 0.5 314 | XB = new_XB 315 | 316 | if rss_score: 317 | predy = np.sum(np.multiply(betas, X), axis=1).reshape((-1, 1)) 318 | new_rss = np.sum((y - predy) ** 2) 319 | score = np.abs((new_rss - rss) / new_rss) 320 | rss = new_rss 321 | scores.append(deepcopy(score)) 322 | delta = score 323 | BWs.append(deepcopy(bws)) 324 | 325 | if verbose: 326 | print("Current iteration:", iters, ",SOC:", np.round(score, 7)) 327 | print("Bandwidths:", ', '.join([str(bw) for bw in bws])) 328 | 329 | if delta < tol: 330 | break 331 | 332 | opt_bw = BWs[-1] 333 | return opt_bw, np.array(BWs), np.array(scores), Betas, err, bw_gwr 334 | 335 | 336 | def multi_bws(init_bw, init_tau, X, y, n, k, tol, rss_score, 337 | gtwr_func, bw_func, sel_func, multi_bw_min, multi_bw_max, 338 | multi_tau_min, multi_tau_max, verbose=False): 339 | """ 340 | Multiscale GTWR bandwidth search procedure using iterative GAM back fitting 341 | """ 342 | if (init_bw is None) or (init_tau is None): 343 | bw, tau = sel_func(bw_func(X, y)) 344 | else: 345 | bw, tau = init_bw, init_tau 346 | opt_model = gtwr_func(X, y, bw, tau) 347 | bw_gtwr = bw 348 | tau_gtwr = tau 349 | err = opt_model.reside 350 | betas = opt_model.betas 351 | 352 | XB = np.multiply(betas, X) 353 | rss = np.sum(err ** 2) if rss_score else None 354 | scores = [] 355 | bws = np.empty(k) 356 | taus = np.empty(k) 357 | BWs = [] 
358 | Taus = [] 359 | Betas = None 360 | 361 | for iter_num in range(1, 201): 362 | new_XB = np.zeros_like(X) 363 | Betas = np.zeros_like(X) 364 | 365 | for j in range(k): 366 | temp_y = XB[:, j].reshape((-1, 1)) 367 | temp_y = temp_y + err 368 | temp_X = X[:, j].reshape((-1, 1)) 369 | bw_class = bw_func(temp_X, temp_y) 370 | 371 | bw, tau = sel_func(bw_class, multi_bw_min[j], multi_bw_max[j], 372 | multi_tau_min[j], multi_tau_max[j]) 373 | 374 | opt_model = gtwr_func(temp_X, temp_y, bw, tau) 375 | err = opt_model.reside 376 | betas = opt_model.betas 377 | new_XB[:, j] = (betas * temp_X).reshape(-1) 378 | Betas[:, j] = betas.reshape(-1) 379 | bws[j] = bw 380 | taus[j] = tau 381 | 382 | num = np.sum((new_XB - XB) ** 2) / n 383 | den = np.sum(np.sum(new_XB, axis=1) ** 2) 384 | score = (num / den) ** 0.5 385 | XB = new_XB 386 | 387 | if rss_score: 388 | predy = np.sum(np.multiply(betas, X), axis=1).reshape((-1, 1)) 389 | new_rss = np.sum((y - predy) ** 2) 390 | score = np.abs((new_rss - rss) / new_rss) 391 | rss = new_rss 392 | scores.append(deepcopy(score)) 393 | delta = score 394 | BWs.append(deepcopy(bws)) 395 | Taus.append(deepcopy(taus)) 396 | 397 | if verbose: 398 | print("Current iteration:", iter_num, ",SOC:", np.round(score, 7)) 399 | print("Bandwidths:", ', '.join([str(bw) for bw in bws])) 400 | print("taus:", ','.join([str(tau) for tau in taus])) 401 | 402 | if delta < tol: 403 | break 404 | opt_bws = BWs[-1] 405 | opt_tau = Taus[-1] 406 | return (opt_bws, opt_tau, np.array(BWs), np.array(Taus), np.array(scores), 407 | Betas, err, bw_gtwr, tau_gtwr) 408 | -------------------------------------------------------------------------------- /mgtwr/kernel.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from scipy.spatial.distance import cdist 3 | 4 | 5 | class GWRKernel: 6 | 7 | def __init__( 8 | self, 9 | coords: np.ndarray, 10 | bw: float = None, 11 | fixed: bool = True, 12 | function: str = 
'triangular', 13 | eps: float = 1.0000001): 14 | 15 | self.coords = coords 16 | self.function = function 17 | self.bw = bw 18 | self.fixed = fixed 19 | self.function = function 20 | self.eps = eps 21 | self.bandwidth = None 22 | self.kernel = None 23 | 24 | def cal_distance( 25 | self, 26 | i: int): 27 | distance = cdist([self.coords[i]], self.coords).reshape(-1) 28 | return distance 29 | 30 | def cal_kernel( 31 | self, 32 | distance 33 | ): 34 | 35 | if self.fixed: 36 | self.bandwidth = float(self.bw) 37 | else: 38 | self.bandwidth = np.partition( 39 | distance, 40 | int(self.bw) - 1)[int(self.bw) - 1] * self.eps # partial sort in O(n) Time 41 | 42 | self.kernel = self._kernel_funcs(distance / self.bandwidth) 43 | 44 | if self.function == "bisquare": # Truncate for bisquare 45 | self.kernel[(distance >= self.bandwidth)] = 0 46 | return self.kernel 47 | 48 | def _kernel_funcs(self, zs): 49 | # functions follow Anselin and Rey (2010) table 5.4 50 | if self.function == 'triangular': 51 | return 1 - zs 52 | elif self.function == 'uniform': 53 | return np.ones(zs.shape) * 0.5 54 | elif self.function == 'quadratic': 55 | return (3. / 4) * (1 - zs ** 2) 56 | elif self.function == 'quartic': 57 | return (15. 
/ 16) * (1 - zs ** 2) ** 2 58 | elif self.function == 'gaussian': 59 | return np.exp(-0.5 * zs ** 2) 60 | elif self.function == 'bisquare': 61 | return (1 - zs ** 2) ** 2 62 | elif self.function == 'exponential': 63 | return np.exp(-zs) 64 | else: 65 | raise ValueError('Unsupported kernel function: %s' % self.function) 66 | 67 | 68 | class GTWRKernel(GWRKernel): 69 | 70 | def __init__( 71 | self, 72 | coords: np.ndarray, 73 | t: np.ndarray, 74 | bw: float = None, 75 | tau: float = None, 76 | fixed: bool = True, 77 | function: str = 'triangular', 78 | eps: float = 1.0000001): 79 | 80 | super(GTWRKernel, self).__init__(coords, bw, fixed=fixed, function=function, eps=eps) 81 | 82 | self.t = t 83 | self.tau = tau 84 | self.coords_new = None 85 | 86 | def cal_distance( 87 | self, 88 | i: int): 89 | 90 | if self.tau == 0: 91 | self.coords_new = self.coords 92 | else: 93 | self.coords_new = np.hstack([self.coords, (np.sqrt(self.tau) * self.t)]) 94 | distance = cdist([self.coords_new[i]], self.coords_new).reshape(-1) 95 | return distance 96 | -------------------------------------------------------------------------------- /mgtwr/model.py: -------------------------------------------------------------------------------- 1 | from typing import Union 2 | import numpy as np 3 | import pandas as pd 4 | import multiprocessing as mp 5 | from .kernel import GWRKernel, GTWRKernel 6 | from .function import _compute_betas_gwr, surface_to_plane 7 | from .obj import CalAicObj, CalMultiObj, BaseModel, GWRResults, GTWRResults, MGWRResults, MGTWRResults 8 | from joblib import Parallel, delayed 9 | 10 | 11 | class GWR(BaseModel): 12 | """ 13 | Geographically Weighted Regression 14 | """ 15 | def __init__( 16 | self, 17 | coords: Union[np.ndarray, pd.DataFrame], 18 | X: Union[np.ndarray, pd.DataFrame], 19 | y: Union[np.ndarray, pd.DataFrame, pd.Series], 20 | bw: float, 21 | kernel: str = 'bisquare', 22 | fixed: bool = True, 23 | constant: bool = True, 24 | thread: int = 1, 25 | convert: bool = False, 26 |
): 27 | """ 28 | Parameters 29 | ---------- 30 | coords : array-like 31 | n*2, spatial coordinates of the observations; if they are longitude and latitude, 32 | the first column should be longitude 33 | 34 | X : array-like 35 | n*k, independent variable, excluding the constant 36 | 37 | y : array-like 38 | n*1, dependent variable 39 | 40 | bw : scalar 41 | bandwidth value consisting of either a distance or N 42 | nearest neighbors; user specified or obtained using 43 | sel 44 | 45 | kernel : string 46 | type of kernel function used to weight observations; 47 | available options: 48 | 'gaussian' 49 | 'bisquare' 50 | 'exponential' 51 | 52 | fixed : bool 53 | True for distance-based kernel function (default) and 54 | False for adaptive (nearest neighbor) kernel function 55 | 56 | constant : bool 57 | True to include intercept (default) in model and False to exclude 58 | intercept. 59 | 60 | thread : int 61 | The number of parallel workers (default 1, no parallelism). Values greater than 1 62 | are worth trying when you have a large amount of data 63 | 64 | convert : bool 65 | Whether to convert longitude and latitude to plane coordinates.
66 | Examples 67 | -------- 68 | import numpy as np 69 | from mgtwr.model import GWR 70 | np.random.seed(10) 71 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 72 | v = np.array([((i-1) % 144) // 12 for i in range(1, 1729)]).reshape(-1, 1) 73 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 74 | x1 = np.random.uniform(0, 1, (1728, 1)) 75 | x2 = np.random.uniform(0, 1, (1728, 1)) 76 | epsilon = np.random.randn(1728, 1) 77 | beta0 = 5 78 | beta1 = 3 + (u + v + t)/6 79 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 80 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 81 | coords = np.hstack([u, v]) 82 | X = np.hstack([x1, x2]) 83 | gwr = GWR(coords, X, y, 0.8, kernel='gaussian', fixed=True).fit() 84 | print(gwr.R2) 85 | 0.7128737240047688 86 | """ 87 | super(GWR, self).__init__(X, y, kernel, fixed, constant) 88 | if thread < 1 or not isinstance(thread, int): 89 | raise ValueError('thread should be an integer greater than or equal to 1') 90 | if isinstance(coords, pd.DataFrame): 91 | coords = coords.values 92 | self.coords = coords 93 | if convert: 94 | longitude = coords[:, 0] 95 | latitude = coords[:, 1] 96 | longitude, latitude = surface_to_plane(longitude, latitude) 97 | self.coords = np.hstack([longitude, latitude]) 98 | self.bw = bw 99 | self.thread = thread 100 | 101 | def _build_wi(self, i, bw): 102 | """ 103 | calculate Weight matrix 104 | """ 105 | try: 106 | gwr_kernel = GWRKernel(self.coords, bw, fixed=self.fixed, function=self.kernel) 107 | distance = gwr_kernel.cal_distance(i) 108 | wi = gwr_kernel.cal_kernel(distance) 109 | except BaseException: 110 | raise # TypeError('Unsupported kernel function ', kernel) 111 | 112 | return wi 113 | 114 | def cal_aic(self): 115 | """ 116 | use for calculating AICc, BIC, CV and so on. 
117 | """ 118 | if self.thread > 1: 119 | result = list(zip(*Parallel(n_jobs=self.thread,pre_dispatch='all',batch_size=self.n//self.thread)(delayed(self._search_local_fit)(i) for i in range(self.n)))) 120 | else: 121 | result = list(zip(*map(self._search_local_fit, range(self.n)))) 122 | err2 = np.array(result[0]).reshape(-1, 1) 123 | hat = np.array(result[1]).reshape(-1, 1) 124 | aa = np.sum(err2 / ((1 - hat) ** 2)) 125 | RSS = np.sum(err2) 126 | tr_S = np.sum(hat) 127 | llf = -np.log(RSS) * self.n / 2 - (1 + np.log(np.pi / self.n * 2)) * self.n / 2 128 | 129 | return CalAicObj(tr_S, float(llf), float(aa), self.n) 130 | 131 | def _search_local_fit(self, i): 132 | wi = self._build_wi(i, self.bw).reshape(-1, 1) 133 | betas, inv_xtx_xt = _compute_betas_gwr(self.y, self.X, wi) 134 | predict = np.dot(self.X[i], betas)[0] 135 | reside = self.y[i] - predict 136 | influx = np.dot(self.X[i], inv_xtx_xt[:, i]) 137 | return reside * reside, influx 138 | 139 | def _local_fit(self, i): 140 | wi = self._build_wi(i, self.bw).reshape(-1, 1) 141 | betas, inv_xtx_xt = _compute_betas_gwr(self.y, self.X, wi) 142 | predict = np.dot(self.X[i], betas)[0] 143 | reside = self.y[i] - predict 144 | influx = np.dot(self.X[i], inv_xtx_xt[:, i]) 145 | Si = np.dot(self.X[i], inv_xtx_xt).reshape(-1) 146 | CCT = np.diag(np.dot(inv_xtx_xt, inv_xtx_xt.T)).reshape(-1) 147 | Si2 = np.sum(Si ** 2) 148 | return influx, reside, predict, betas.reshape(-1), CCT, Si2 149 | 150 | def _multi_fit(self, i): 151 | wi = self._build_wi(i, self.bw).reshape(-1, 1) 152 | betas, inv_xtx_xt = _compute_betas_gwr(self.y, self.X, wi) 153 | pre = np.dot(self.X[i], betas)[0] 154 | reside = self.y[i] - pre 155 | return betas.reshape(-1), pre, reside 156 | 157 | def cal_multi(self): 158 | """ 159 | calculate betas, predict value and reside, use for searching best bandwidth in MGWR model by backfitting. 
160 | """ 161 | if self.thread > 1: 162 | result = list(zip(*Parallel(n_jobs=self.thread,pre_dispatch='all',batch_size=self.n//self.thread)(delayed(self._multi_fit)(i) for i in range(self.n)))) 163 | else: 164 | result = list(zip(*map(self._multi_fit, range(self.n)))) 165 | betas = np.array(result[0]) 166 | pre = np.array(result[1]).reshape(-1, 1) 167 | reside = np.array(result[2]).reshape(-1, 1) 168 | return CalMultiObj(betas, pre, reside) 169 | 170 | def fit(self): 171 | """ 172 | To fit GWR model 173 | """ 174 | if self.thread > 1: 175 | result = list(zip(*Parallel(n_jobs=self.thread,pre_dispatch='all',batch_size=self.n//self.thread)(delayed(self._local_fit)(i) for i in range(self.n)))) 176 | else: 177 | result = list(zip(*map(self._local_fit, range(self.n)))) 178 | influ = np.array(result[0]).reshape(-1, 1) 179 | reside = np.array(result[1]).reshape(-1, 1) 180 | predict_value = np.array(result[2]).reshape(-1, 1) 181 | betas = np.array(result[3]) 182 | CCT = np.array(result[4]) 183 | tr_STS = np.array(result[5]) 184 | return GWRResults(self.coords, self.X, self.y, self.bw, self.kernel, self.fixed, 185 | influ, reside, predict_value, betas, CCT, tr_STS) 186 | 187 | 188 | class MGWR(GWR): 189 | """ 190 | Multiscale Geographically Weighted Regression 191 | """ 192 | def __init__( 193 | self, 194 | coords: np.ndarray, 195 | X: np.ndarray, 196 | y: np.ndarray, 197 | selector, 198 | kernel: str = 'bisquare', 199 | fixed: bool = False, 200 | constant: bool = True, 201 | thread: int = 1, 202 | convert: bool = False 203 | ): 204 | """ 205 | Parameters 206 | ---------- 207 | coords : array-like 208 | n*2, spatial coordinates of the observations, if it's latitude and longitude, 209 | the first column should be longitude 210 | 211 | X : array-like 212 | n*k, independent variable, excluding the constant 213 | 214 | y : array-like 215 | n*1, dependent variable 216 | 217 | selector :SearchMGWRParameter object 218 | valid SearchMGWRParameter that has successfully called 219 | 
the "search" method. This parameter passes on 220 | information from GAM model estimation including optimal 221 | bandwidths. 222 | 223 | kernel : string 224 | type of kernel function used to weight observations; 225 | available options: 226 | 'gaussian' 227 | 'bisquare' 228 | 'exponential' 229 | 230 | fixed : bool 231 | True for distance based kernel function (default) and False for 232 | adaptive (nearest neighbor) kernel function 233 | 234 | constant : bool 235 | True to include intercept (default) in model and False to exclude 236 | intercept. 237 | 238 | thread : int 239 | The number of processes in parallel computation. If you have a large amount of data, 240 | you can use it 241 | 242 | convert : bool 243 | Whether to convert latitude and longitude to plane coordinates. 244 | Examples 245 | -------- 246 | import numpy as np 247 | from mgtwr.sel import SearchMGWRParameter 248 | from mgtwr.model import MGWR 249 | np.random.seed(10) 250 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 251 | v = np.array([((i-1) % 144) // 12 for i in range(1, 1729)]).reshape(-1, 1) 252 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 253 | x1 = np.random.uniform(0, 1, (1728, 1)) 254 | x2 = np.random.uniform(0, 1, (1728, 1)) 255 | epsilon = np.random.randn(1728, 1) 256 | beta0 = 5 257 | beta1 = 3 + (u + v + t)/6 258 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 259 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 260 | coords = np.hstack([u, v]) 261 | X = np.hstack([x1, x2]) 262 | sel_multi = SearchMGWRParameter(coords, X, y, kernel='gaussian', fixed=True) 263 | bws = sel_multi.search(multi_bw_max=[40], verbose=True) 264 | mgwr = MGWR(coords, X, y, sel_multi, kernel='gaussian', fixed=True).fit() 265 | print(mgwr.R2) 266 | 0.7045642214972343 267 | """ 268 | self.selector = selector 269 | self.bws = self.selector.bws[0] # final set of bandwidth 270 | self.bws_history = selector.bws[1] # bws history in back_fitting 271 | 
self.betas = selector.bws[3] 272 | bw_init = self.selector.bws[5] # initialization bandwidth 273 | super().__init__( 274 | coords, X, y, bw_init, kernel=kernel, fixed=fixed, constant=constant, thread=thread, convert=convert) 275 | self.n_chunks = None 276 | self.ENP_j = None 277 | 278 | def _chunk_compute(self, chunk_id=0): 279 | n = self.n 280 | k = self.k 281 | n_chunks = self.n_chunks 282 | chunk_size = int(np.ceil(float(n / n_chunks))) 283 | ENP_j = np.zeros(self.k) 284 | CCT = np.zeros((self.n, self.k)) 285 | 286 | chunk_index = np.arange(n)[chunk_id * chunk_size:(chunk_id + 1) * chunk_size] 287 | init_pR = np.zeros((n, len(chunk_index))) 288 | init_pR[chunk_index, :] = np.eye(len(chunk_index)) 289 | pR = np.zeros((n, len(chunk_index), 290 | k)) # partial R: n by chunk_size by k 291 | 292 | for i in range(n): 293 | wi = self._build_wi(i, self.bw).reshape(-1, 1) 294 | xT = (self.X * wi).T 295 | P = np.linalg.solve(xT.dot(self.X), xT).dot(init_pR).T 296 | pR[i, :, :] = P * self.X[i] 297 | 298 | err = init_pR - np.sum(pR, axis=2) # n by chunk_size 299 | 300 | for iter_i in range(self.bws_history.shape[0]): 301 | for j in range(k): 302 | pRj_old = pR[:, :, j] + err 303 | Xj = self.X[:, j] 304 | n_chunks_Aj = n_chunks 305 | chunk_size_Aj = int(np.ceil(float(n / n_chunks_Aj))) 306 | for chunk_Aj in range(n_chunks_Aj): 307 | chunk_index_Aj = np.arange(n)[chunk_Aj * chunk_size_Aj:( 308 | chunk_Aj + 1) * chunk_size_Aj] 309 | pAj = np.empty((len(chunk_index_Aj), n)) 310 | for i in range(len(chunk_index_Aj)): 311 | index = chunk_index_Aj[i] 312 | wi = self._build_wi(index, self.bws_history[iter_i, j]) 313 | xw = Xj * wi 314 | pAj[i, :] = Xj[index] / np.sum(xw * Xj) * xw 315 | pR[chunk_index_Aj, :, j] = pAj.dot(pRj_old) 316 | err = pRj_old - pR[:, :, j] 317 | 318 | for j in range(k): 319 | CCT[:, j] += ((pR[:, :, j] / self.X[:, j].reshape(-1, 1)) ** 2).sum( 320 | axis=1) 321 | for i in range(len(chunk_index)): 322 | ENP_j += pR[chunk_index[i], i, :] 323 | 324 | return 
ENP_j, CCT 325 | 326 | def fit(self, n_chunks: int = 1, skip_calculate: bool = False): 327 | """ 328 | Compute MGWR inference by chunk to reduce memory footprint. 329 | Parameters 330 | ---------- 331 | n_chunks : int 332 | number of chunks the computation is split into to reduce memory consumption 333 | skip_calculate : bool 334 | if True, skip calculating CCT, ENP and the variables derived from them 335 | """ 336 | pre = np.sum(self.X * self.betas, axis=1).reshape(-1, 1) 337 | ENP_j = None 338 | CCT = None 339 | if not skip_calculate: 340 | self.n_chunks = n_chunks 341 | result = map(self._chunk_compute, range(n_chunks)) 342 | result_list = list(zip(*result)) 343 | ENP_j = np.sum(np.array(result_list[0]), axis=0) 344 | CCT = np.sum(np.array(result_list[1]), axis=0) 345 | return MGWRResults( 346 | self.coords, self.X, self.y, self.bws, self.kernel, self.fixed, 347 | self.bws_history, self.betas, pre, ENP_j, CCT) 348 | 349 | 350 | class GTWR(BaseModel): 351 | """ 352 | Geographically and Temporally Weighted Regression 353 | 354 | Parameters 355 | ---------- 356 | coords : array-like 357 | n*2, collection of n sets of (x,y) coordinates of 358 | observations 359 | 360 | t : array-like 361 | n*1, time location 362 | 363 | X : array-like 364 | n*k, independent variable, excluding the constant 365 | 366 | y : array-like 367 | n*1, dependent variable 368 | 369 | bw : scalar 370 | bandwidth value consisting of either a distance or N 371 | nearest neighbors; user specified or obtained using 372 | sel 373 | 374 | tau : scalar 375 | spatio-temporal scale 376 | 377 | kernel : string 378 | type of kernel function used to weight observations; 379 | available options: 380 | 'gaussian' 381 | 'bisquare' 382 | 'exponential' 383 | 384 | fixed : bool 385 | True for distance-based kernel function and 386 | False for adaptive (nearest neighbor) kernel function (default) 387 | 388 | constant : bool 389 | True to include intercept (default) in model and False to exclude 390 | intercept.
391 | 392 | Examples 393 | -------- 394 | import numpy as np 395 | from mgtwr.model import GTWR 396 | np.random.seed(10) 397 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 398 | v = np.array([((i-1) % 144) // 12 for i in range(1, 1729)]).reshape(-1, 1) 399 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 400 | x1 = np.random.uniform(0, 1, (1728, 1)) 401 | x2 = np.random.uniform(0, 1, (1728, 1)) 402 | epsilon = np.random.randn(1728, 1) 403 | beta0 = 5 404 | beta1 = 3 + (u + v + t)/6 405 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 406 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 407 | coords = np.hstack([u, v]) 408 | X = np.hstack([x1, x2]) 409 | gtwr = GTWR(coords, t, X, y, 0.8, 1.9, kernel='gaussian', fixed=True).fit() 410 | print(gtwr.R2) 411 | 0.9899869616636376 412 | """ 413 | 414 | def __init__( 415 | self, 416 | coords: Union[np.ndarray, pd.DataFrame], 417 | t: Union[np.ndarray, pd.DataFrame], 418 | X: Union[np.ndarray, pd.DataFrame], 419 | y: Union[np.ndarray, pd.DataFrame], 420 | bw: float, 421 | tau: float, 422 | kernel: str = 'gaussian', 423 | fixed: bool = False, 424 | constant: bool = True, 425 | thread: int = 1, 426 | convert: bool = False 427 | ): 428 | super(GTWR, self).__init__(X, y, kernel, fixed, constant) 429 | if thread < 1 or not isinstance(thread, int): 430 | raise ValueError('thread should be an integer greater than or equal to 1') 431 | if isinstance(coords, pd.DataFrame): 432 | coords = coords.values 433 | self.coords = coords 434 | if convert: 435 | longitude = coords[:, 0] 436 | latitude = coords[:, 1] 437 | longitude, latitude = surface_to_plane(longitude, latitude) 438 | self.coords = np.hstack([longitude, latitude]) 439 | self.t = t 440 | self.bw = bw 441 | self.tau = tau 442 | self.bw_s = self.bw 443 | self.bw_t = np.sqrt(self.bw ** 2 / self.tau) 444 | self.thread = thread 445 | 446 | def _build_wi(self, i, bw, tau): 447 | """ 448 | calculate Weight matrix 449 | """ 450 | 
try: 451 | gtwr_kernel = GTWRKernel(self.coords, self.t, bw, tau, fixed=self.fixed, function=self.kernel) 452 | distance = gtwr_kernel.cal_distance(i) 453 | wi = gtwr_kernel.cal_kernel(distance) 454 | except BaseException: 455 | raise # TypeError('Unsupported kernel function ', kernel) 456 | 457 | return wi 458 | 459 | def cal_aic(self): 460 | 461 | """ 462 | use for calculating AICc, BIC, CV and so on. 463 | """ 464 | if self.thread > 1: 465 | result = list(zip(*Parallel(n_jobs=self.thread,pre_dispatch='all',batch_size=self.n//self.thread)(delayed(self._search_local_fit)(i) for i in range(self.n)))) 466 | else: 467 | result = list(zip(*map(self._search_local_fit, range(self.n)))) 468 | err2 = np.array(result[0]).reshape(-1, 1) 469 | hat = np.array(result[1]).reshape(-1, 1) 470 | aa = np.sum(err2 / ((1 - hat) ** 2)) 471 | RSS = np.sum(err2) 472 | tr_S = np.sum(hat) 473 | llf = -np.log(RSS) * self.n / 2 - (1 + np.log(np.pi / self.n * 2)) * self.n / 2 474 | 475 | return CalAicObj(tr_S, float(llf), float(aa), self.n) 476 | 477 | def _search_local_fit(self, i): 478 | wi = self._build_wi(i, self.bw, self.tau).reshape(-1, 1) 479 | betas, xtx_inv_xt = _compute_betas_gwr(self.y, self.X, wi) 480 | predict = np.dot(self.X[i], betas)[0] 481 | reside = self.y[i] - predict 482 | influ = np.dot(self.X[i], xtx_inv_xt[:, i]) 483 | return reside * reside, influ 484 | 485 | def _local_fit(self, i): 486 | wi = self._build_wi(i, self.bw, self.tau).reshape(-1, 1) 487 | betas, xtx_inv_xt = _compute_betas_gwr(self.y, self.X, wi) 488 | predict = np.dot(self.X[i], betas)[0] 489 | reside = self.y[i] - predict 490 | influ = np.dot(self.X[i], xtx_inv_xt[:, i]) 491 | Si = np.dot(self.X[i], xtx_inv_xt).reshape(-1) 492 | CCT = np.diag(np.dot(xtx_inv_xt, xtx_inv_xt.T)).reshape(-1) 493 | Si2 = np.sum(Si ** 2) 494 | return influ, reside, predict, betas.reshape(-1), CCT, Si2 495 | 496 | def _multi_fit(self, i): 497 | wi = self._build_wi(i, self.bw, self.tau).reshape(-1, 1) 498 | betas, inv_xtx_xt = 
_compute_betas_gwr(self.y, self.X, wi) 499 | pre = np.dot(self.X[i], betas)[0] 500 | reside = self.y[i] - pre 501 | return betas.reshape(-1), pre, reside 502 | 503 | def cal_multi(self): 504 | """ 505 | Calculate betas, predicted values and residuals; used when searching for the best bandwidth in the MGTWR model by backfitting. 506 | """ 507 | if self.thread > 1: 508 | result = list(zip(*Parallel(n_jobs=self.thread,pre_dispatch='all',batch_size=self.n//self.thread)(delayed(self._multi_fit)(i) for i in range(self.n)))) 509 | else: 510 | result = list(zip(*map(self._multi_fit, range(self.n)))) 511 | betas = np.array(result[0]) 512 | pre = np.array(result[1]).reshape(-1, 1) 513 | reside = np.array(result[2]).reshape(-1, 1) 514 | return CalMultiObj(betas, pre, reside) 515 | 516 | def fit(self): 517 | """ 518 | Fit the GTWR model 519 | 520 | """ 521 | if self.thread > 1: 522 | result = list(zip(*Parallel(n_jobs=self.thread,pre_dispatch='all',batch_size=self.n//self.thread)(delayed(self._local_fit)(i) for i in range(self.n)))) 523 | else: 524 | result = list(zip(*map(self._local_fit, range(self.n)))) 525 | influ = np.array(result[0]).reshape(-1, 1) 526 | reside = np.array(result[1]).reshape(-1, 1) 527 | predict_value = np.array(result[2]).reshape(-1, 1) 528 | betas = np.array(result[3]) 529 | CCT = np.array(result[4]) 530 | tr_STS = np.array(result[5]) 531 | return GTWRResults( 532 | self.coords, self.t, self.X, self.y, self.bw, self.tau, self.kernel, self.fixed, 533 | influ, reside, predict_value, betas, CCT, tr_STS 534 | ) 535 | 536 | 537 | class MGTWR(GTWR): 538 | """ 539 | Multiscale GTWR estimation and inference.
540 | 541 | Parameters 542 | ---------- 543 | coords : array-like 544 | n*2, collection of n sets of (x,y) coordinates of 545 | observations 546 | 547 | t : array 548 | n*1, time location 549 | 550 | X : array-like 551 | n*k, independent variable, excluding the constant 552 | 553 | y : array-like 554 | n*1, dependent variable 555 | 556 | selector : SearchMGTWRParameter object 557 | valid SearchMGTWRParameter object that has successfully called 558 | the "search" method. This parameter passes on 559 | information from GAM model estimation including optimal 560 | bandwidths. 561 | 562 | kernel : string 563 | type of kernel function used to weight observations; 564 | available options: 565 | 'gaussian' 566 | 'bisquare' 567 | 'exponential' 568 | 569 | fixed : bool 570 | True for distance-based kernel function and False for 571 | adaptive (nearest neighbor) kernel function (default) 572 | 573 | constant : bool 574 | True to include intercept (default) in model and False to exclude 575 | intercept.
576 | Examples 577 | -------- 578 | import numpy as np 579 | from mgtwr.sel import SearchMGTWRParameter 580 | from mgtwr.model import MGTWR 581 | np.random.seed(10) 582 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 583 | v = np.array([((i-1) % 144)//12 for i in range(1, 1729)]).reshape(-1, 1) 584 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 585 | x1 = np.random.uniform(0, 1, (1728, 1)) 586 | x2 = np.random.uniform(0, 1, (1728, 1)) 587 | epsilon = np.random.randn(1728, 1) 588 | beta0 = 5 589 | beta1 = 3 + (u + v + t)/6 590 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 591 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 592 | coords = np.hstack([u, v]) 593 | X = np.hstack([x1, x2]) 594 | sel_multi = SearchMGTWRParameter(coords, t, X, y, kernel='gaussian', fixed=True) 595 | bws = sel_multi.search(multi_bw_min=[0.1], verbose=True, tol_multi=1.0e-4) 596 | mgtwr = MGTWR(coords, t, X, y, sel_multi, kernel='gaussian', fixed=True).fit() 597 | print(mgtwr.R2) 598 | 0.9972924820674222 599 | """ 600 | 601 | def __init__( 602 | self, 603 | coords: np.ndarray, 604 | t: np.ndarray, 605 | X: np.ndarray, 606 | y: np.ndarray, 607 | selector, 608 | kernel: str = 'bisquare', 609 | fixed: bool = False, 610 | constant: bool = True, 611 | thread: int = 1, 612 | convert: bool = False 613 | ): 614 | self.selector = selector 615 | self.bws = self.selector.bws[0] # final set of bandwidth 616 | self.taus = self.selector.bws[1] 617 | self.bw_ts = np.sqrt(self.bws ** 2 / self.taus) 618 | self.bws_history = selector.bws[2] # bws history in back_fitting 619 | self.taus_history = selector.bws[3] 620 | self.betas = selector.bws[5] 621 | bw_init = self.selector.bws[7] # initialization bandwidth 622 | tau_init = self.selector.bws[8] 623 | super().__init__(coords, t, X, y, bw_init, tau_init, 624 | kernel=kernel, fixed=fixed, constant=constant, thread=thread, convert=convert) 625 | self.n_chunks = None 626 | self.ENP_j = None 627 | 628 | 
def _chunk_compute(self, chunk_id=0): 629 | n = self.n 630 | k = self.k 631 | n_chunks = self.n_chunks 632 | chunk_size = int(np.ceil(float(n / n_chunks))) 633 | ENP_j = np.zeros(self.k) 634 | CCT = np.zeros((self.n, self.k)) 635 | 636 | chunk_index = np.arange(n)[chunk_id * chunk_size:(chunk_id + 1) * chunk_size] 637 | init_pR = np.zeros((n, len(chunk_index))) 638 | init_pR[chunk_index, :] = np.eye(len(chunk_index)) 639 | pR = np.zeros((n, len(chunk_index), 640 | k)) # partial R: n by chunk_size by k 641 | 642 | for i in range(n): 643 | wi = self._build_wi(i, self.bw, self.tau).reshape(-1, 1) 644 | xT = (self.X * wi).T 645 | P = np.linalg.solve(xT.dot(self.X), xT).dot(init_pR).T 646 | pR[i, :, :] = P * self.X[i] 647 | 648 | err = init_pR - np.sum(pR, axis=2) # n by chunk_size 649 | 650 | for iter_i in range(self.bws_history.shape[0]): 651 | for j in range(k): 652 | pRj_old = pR[:, :, j] + err 653 | Xj = self.X[:, j] 654 | n_chunks_Aj = n_chunks 655 | chunk_size_Aj = int(np.ceil(float(n / n_chunks_Aj))) 656 | for chunk_Aj in range(n_chunks_Aj): 657 | chunk_index_Aj = np.arange(n)[chunk_Aj * chunk_size_Aj:( 658 | chunk_Aj + 1) * chunk_size_Aj] 659 | pAj = np.empty((len(chunk_index_Aj), n)) 660 | for i in range(len(chunk_index_Aj)): 661 | index = chunk_index_Aj[i] 662 | wi = self._build_wi(index, self.bws_history[iter_i, j], 663 | self.taus_history[iter_i, j]) 664 | xw = Xj * wi 665 | pAj[i, :] = Xj[index] / np.sum(xw * Xj) * xw 666 | pR[chunk_index_Aj, :, j] = pAj.dot(pRj_old) 667 | err = pRj_old - pR[:, :, j] 668 | 669 | for j in range(k): 670 | CCT[:, j] += ((pR[:, :, j] / self.X[:, j].reshape(-1, 1)) ** 2).sum( 671 | axis=1) 672 | for i in range(len(chunk_index)): 673 | ENP_j += pR[chunk_index[i], i, :] 674 | 675 | return ENP_j, CCT, 676 | 677 | def fit(self, n_chunks: int = 1, skip_calculate: bool = False): 678 | """ 679 | Compute MGTWR inference by chunk to reduce memory footprint. 
680 | Parameters 681 | ---------- 682 | n_chunks : int 683 | divided into n_chunks steps to reduce memory consumption 684 | skip_calculate : bool 685 | if True, skip calculate CCT, ENP and other variables derived from it 686 | """ 687 | pre = np.sum(self.X * self.betas, axis=1).reshape(-1, 1) 688 | ENP_j = None 689 | CCT = None 690 | if not skip_calculate: 691 | self.n_chunks = n_chunks 692 | result = map(self._chunk_compute, (range(n_chunks))) 693 | result_list = list(zip(*result)) 694 | ENP_j = np.sum(np.array(result_list[0]), axis=0) 695 | CCT = np.sum(np.array(result_list[1]), axis=0) 696 | return MGTWRResults( 697 | self.coords, self.t, self.X, self.y, self.bws, self.taus, self.kernel, self.fixed, self.bw_ts, 698 | self.bws_history, self.taus_history, self.betas, pre, ENP_j, CCT) 699 | -------------------------------------------------------------------------------- /mgtwr/obj.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pandas as pd 3 | from typing import Union 4 | 5 | 6 | class CalAicObj: 7 | 8 | def __init__(self, tr_S, llf, aa, n): 9 | self.tr_S = tr_S 10 | self.llf = llf 11 | self.aa = aa 12 | self.n = n 13 | 14 | 15 | class CalMultiObj: 16 | 17 | def __init__(self, betas, pre, reside): 18 | self.betas = betas 19 | self.pre = pre 20 | self.reside = reside 21 | 22 | 23 | class BaseModel: 24 | """ 25 | Is the parent class of most models 26 | """ 27 | def __init__( 28 | self, 29 | X: Union[np.ndarray, pd.DataFrame, pd.Series], 30 | y: Union[np.ndarray, pd.DataFrame, pd.Series], 31 | kernel: str, 32 | fixed: bool, 33 | constant: bool, 34 | ): 35 | self.X = X.values if isinstance(X, (pd.DataFrame, pd.Series)) else X 36 | self.y = y.values if isinstance(y, (pd.DataFrame, pd.Series)) else y 37 | if len(y.shape) > 1 and y.shape[1] != 1: 38 | raise ValueError('Label should be one-dimensional arrays') 39 | if len(y.shape) == 1: 40 | self.y = self.y.reshape(-1, 1) 41 | self.kernel = kernel 42 | 
self.fixed = fixed 43 | self.constant = constant 44 | self.n = X.shape[0] 45 | if self.constant: 46 | if len(self.X.shape) == 1 and np.all(self.X == 1): 47 | raise ValueError("You've already passed in a constant sequence, use constant=False instead") 48 | for j in range(self.X.shape[1]): 49 | if np.all(self.X[:, j] == 1): 50 | raise ValueError("You've already passed in a constant sequence, use constant=False instead") 51 | self.X = np.hstack([np.ones((self.n, 1)), X]) 52 | self.k = self.X.shape[1] 53 | 54 | 55 | class Results(BaseModel): 56 | """ 57 | Is the result parent class of all models 58 | """ 59 | 60 | def __init__( 61 | self, 62 | X: Union[np.ndarray, pd.DataFrame], 63 | y: Union[np.ndarray, pd.Series], 64 | kernel: str, 65 | fixed: bool, 66 | influ: np.ndarray, 67 | reside, 68 | predict_value: np.ndarray, 69 | betas: np.ndarray, 70 | tr_STS: float 71 | ): 72 | super(Results, self).__init__(X, y, kernel, fixed, constant=False) 73 | self.influ = influ 74 | self.reside = reside 75 | self.predict_value = predict_value 76 | self.betas = betas 77 | self.tr_S = np.sum(influ) 78 | self.ENP = self.tr_S 79 | self.tr_STS = tr_STS 80 | self.TSS = np.sum((y - np.mean(y)) ** 2) 81 | self.RSS = np.sum(reside ** 2) 82 | self.sigma2 = self.RSS / (self.n - self.tr_S) 83 | self.std_res = self.reside / (np.sqrt(self.sigma2 * (1.0 - self.influ))) 84 | self.cooksD = self.std_res ** 2 * self.influ / (self.tr_S * (1.0 - self.influ)) 85 | self.df_model = self.n - self.tr_S 86 | self.df_reside = self.n - 2.0 * self.tr_S + self.tr_STS 87 | self.R2 = 1 - self.RSS / self.TSS 88 | self.adj_R2 = 1 - (1 - self.R2) * (self.n - 1) / (self.n - self.ENP - 1) 89 | self.llf = -np.log(self.RSS) * self.n / 2 - (1 + np.log(np.pi / self.n * 2)) * self.n / 2 90 | self.aic = -2.0 * self.llf + 2.0 * (self.tr_S + 1) 91 | self.aicc = self.aic + 2.0 * self.tr_S * (self.tr_S + 1.0) / (self.n - self.tr_S - 1.0) 92 | self.bic = -2.0 * self.llf + (self.k + 1) * np.log(self.n) 93 | 94 | 95 | class 
GWRResults(Results): 96 | 97 | def __init__( 98 | self, coords, X, y, bw, kernel, fixed, influ, reside, predict_value, betas, CCT, tr_STS 99 | ): 100 | """ 101 | betas : array 102 | n*k, estimated coefficients 103 | 104 | predict_value : array 105 | n*1, predicted y values 106 | 107 | CCT : array 108 | n*k, scaled variance-covariance matrix 109 | 110 | df_model : integer 111 | model degrees of freedom 112 | 113 | df_reside : integer 114 | residual degrees of freedom 115 | 116 | reside : array 117 | n*1, residuals of the response 118 | 119 | RSS : scalar 120 | residual sum of squares 121 | 122 | CCT : array 123 | n*k, scaled variance-covariance matrix 124 | 125 | ENP : scalar 126 | effective number of parameters, which depends on 127 | sigma2 128 | 129 | tr_S : float 130 | trace of S (hat) matrix 131 | 132 | tr_STS : float 133 | trace of STS matrix 134 | 135 | R2 : float 136 | R-squared for the entire model (1 - RSS/TSS) 137 | 138 | adj_R2 : float 139 | adjusted R-squared for the entire model 140 | 141 | aic : float 142 | Akaike information criterion 143 | 144 | aicc : float 145 | corrected Akaike information criterion 146 | to account for model complexity (smaller 147 | bandwidths) 148 | 149 | bic : float 150 | Bayesian information criterion 151 | 152 | sigma2 : float 153 | sigma squared (residual variance) that has been 154 | corrected to account for the ENP 155 | 156 | std_res : array 157 | n*1, standardised residuals 158 | 159 | bse : array 160 | n*k, standard errors of parameters (betas) 161 | 162 | influ : array 163 | n*1, leading diagonal of S matrix 164 | 165 | cooksD : array 166 | n*1, Cook's D 167 | 168 | tvalues : array 169 | n*k, local t-statistics 170 | 171 | llf : scalar 172 | log-likelihood of the full model; see 173 | pysal.contrib.glm.family for family-specific 174 | log-likelihoods 175 | """ 176 | 177 | super(GWRResults, self).__init__( 178 | X, y, kernel, fixed, influ, reside, predict_value, betas, tr_STS) 179 | self.coords = coords 180 | self.bw = bw 181 
| self.CCT = CCT * self.sigma2 182 | self.bse = np.sqrt(self.CCT) 183 | self.tvalues = self.betas / self.bse 184 | 185 | 186 | class GTWRResults(Results): 187 | 188 | def __init__( 189 | self, coords, t, X, y, bw, tau, kernel, fixed, influ, reside, predict_value, betas, CCT, tr_STS 190 | ): 191 | """ 192 | tau : scalar 193 | spatio-temporal scale 194 | bw_s : scalar 195 | spatial bandwidth 196 | bw_t : scalar 197 | temporal bandwidth 198 | See Also GWRResults 199 | """ 200 | 201 | super(GTWRResults, self).__init__(X, y, kernel, fixed, influ, reside, predict_value, betas, tr_STS) 202 | self.coords = coords 203 | self.t = t 204 | self.bw = bw 205 | self.tau = tau 206 | self.bw_s = self.bw 207 | self.bw_t = np.sqrt(self.bw ** 2 / self.tau) 208 | self.CCT = CCT * self.sigma2 209 | self.bse = np.sqrt(self.CCT) 210 | self.tvalues = self.betas / self.bse 211 | 212 | 213 | class MGWRResults(BaseModel): 214 | 215 | def __init__(self, coords, X, y, bws, kernel, fixed, bws_history, betas, 216 | predict_value, ENP_j, CCT): 217 | """ 218 | bws : array-like 219 | corresponding spatial bandwidth of all variables 220 | ENP_j : array-like 221 | effective number of parameters, which depends on 222 | sigma2, for each covariate in the model 223 | 224 | See Also GWRResults 225 | """ 226 | super(MGWRResults, self).__init__(X, y, kernel, fixed, constant=False) 227 | self.coords = coords 228 | self.bws = bws 229 | self.bws_history = bws_history 230 | self.predict_value = predict_value 231 | self.betas = betas 232 | self.reside = self.y - self.predict_value 233 | self.TSS = np.sum((self.y - np.mean(self.y)) ** 2) 234 | self.RSS = np.sum(self.reside ** 2) 235 | self.R2 = 1 - self.RSS / self.TSS 236 | self.llf = -np.log(self.RSS) * self.n / 2 - (1 + np.log(np.pi / self.n * 2)) * self.n / 2 237 | self.bic = -2.0 * self.llf + (self.k + 1) * np.log(self.n) 238 | if ENP_j is not None: 239 | self.ENP_j = ENP_j 240 | self.tr_S = np.sum(self.ENP_j) 241 | self.ENP = self.tr_S 242 | self.sigma2 = 
self.RSS / (self.n - self.tr_S) 243 | self.CCT = CCT * self.sigma2 244 | self.bse = np.sqrt(self.CCT) 245 | self.t_values = self.betas / self.bse 246 | self.df_model = self.n - self.tr_S 247 | self.adj_R2 = 1 - (1 - self.R2) * (self.n - 1) / (self.n - self.ENP - 1) 248 | self.aic = -2.0 * self.llf + 2.0 * (self.tr_S + 1) 249 | self.aic_c = self.aic + 2.0 * self.tr_S * (self.tr_S + 1.0) / (self.n - self.tr_S - 1.0) 250 | 251 | 252 | class MGTWRResults(MGWRResults): 253 | 254 | def __init__(self, coords, t, X, y, bws, taus, kernel, fixed, bw_ts, bws_history, taus_history, betas, 255 | predict_value, ENP_j, CCT): 256 | """ 257 | taus : array-like 258 | corresponding spatio-temporal scale of all variables 259 | bws : array-like 260 | corresponding spatial bandwidth of all variables 261 | bw_ts : array-like 262 | corresponding temporal bandwidth of all variables 263 | See Also 264 | ------------- 265 | MGWRResults 266 | GWRResults 267 | """ 268 | super(MGTWRResults, self).__init__( 269 | coords, X, y, bws, kernel, fixed, bws_history, betas, predict_value, ENP_j, CCT) 270 | self.t = t 271 | self.taus = taus 272 | self.bw_ts = bw_ts 273 | self.taus_history = taus_history 274 | -------------------------------------------------------------------------------- /mgtwr/sel.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from typing import Union 3 | import pandas as pd 4 | from .diagnosis import get_AICc, get_AIC, get_BIC, get_CV 5 | from .obj import BaseModel 6 | from scipy.spatial.distance import pdist 7 | from .model import GWR, GTWR 8 | from .function import golden_section, surface_to_plane, print_time, twostep_golden_section, multi_bw, multi_bws 9 | 10 | getDiag = {'AICc': get_AICc, 'AIC': get_AIC, 'BIC': get_BIC, 'CV': get_CV} 11 | 12 | delta = 0.38197 13 | 14 | 15 | class SearchGWRParameter(BaseModel): 16 | """ 17 | Select bandwidth for GWR model 18 | 19 | Parameters 20 | ---------- 21 | coords : array-like 22 
| n*2, collection of n sets of (x,y) coordinates of 23 | observations 24 | 25 | y : array-like 26 | n*1, dependent variable 27 | 28 | X : array-like 29 | n*k, independent variable, excluding the constant 30 | 31 | kernel : string 32 | type of kernel function used to weight observations; 33 | available options: 34 | 'gaussian' 35 | 'bisquare' 36 | 'exponential' 37 | 38 | fixed : boolean 39 | True for distance based kernel function and False for 40 | adaptive (nearest neighbor) kernel function (default) 41 | 42 | constant : boolean 43 | True to include intercept (default) in model and False to exclude 44 | intercept. 45 | 46 | Examples 47 | -------- 48 | import numpy as np 49 | from mgtwr.sel import SearchGWRParameter 50 | np.random.seed(1) 51 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 52 | v = np.array([((i-1) % 144) // 12 for i in range(1, 1729)]).reshape(-1, 1) 53 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 54 | x1 = np.random.uniform(0, 1, (1728, 1)) 55 | x2 = np.random.uniform(0, 1, (1728, 1)) 56 | epsilon = np.random.randn(1728, 1) 57 | beta0 = 5 58 | beta1 = 3 + (u + v + t)/6 59 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 60 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 61 | coords = np.hstack([u, v]) 62 | X = np.hstack([x1, x2]) 63 | sel = SearchGWRParameter(coords, X, y, kernel='gaussian', fixed=True) 64 | bw = sel.search(bw_max=40, verbose=True) 65 | 2.0 66 | """ 67 | 68 | def __init__( 69 | self, 70 | coords: Union[np.ndarray, pd.DataFrame], 71 | X: Union[np.ndarray, pd.DataFrame], 72 | y: Union[np.ndarray, pd.DataFrame], 73 | kernel: str = 'exponential', 74 | fixed: bool = False, 75 | constant: bool = True, 76 | convert: bool = False, 77 | thread: int = 1 78 | ): 79 | 80 | super(SearchGWRParameter, self).__init__(X, y, kernel, fixed, constant) 81 | if isinstance(coords, pd.DataFrame): 82 | coords = coords.values 83 | self.coords = coords 84 | if convert: 85 | longitude = coords[:, 0] 
86 | latitude = coords[:, 1] 87 | longitude, latitude = surface_to_plane(longitude, latitude) 88 | self.coords = np.hstack([longitude, latitude]) 89 | self.int_score = not self.fixed 90 | self.thread = thread 91 | 92 | @print_time 93 | def search(self, 94 | criterion: str = 'AICc', 95 | bw_min: float = None, 96 | bw_max: float = None, 97 | tol: float = 1.0e-6, 98 | bw_decimal: int = 0, 99 | max_iter: int = 200, 100 | verbose: bool = True, 101 | time_cost: bool = False 102 | ): 103 | """ 104 | Method to select one unique bandwidth for a GWR model. 105 | 106 | Parameters 107 | ---------- 108 | criterion : string 109 | bw selection criterion: 'AICc', 'AIC', 'BIC', 'CV' 110 | bw_min : float 111 | min value used in bandwidth search 112 | bw_max : float 113 | max value used in bandwidth search 114 | tol : float 115 | tolerance used to determine convergence 116 | max_iter : integer 117 | max iterations if no convergence to tol 118 | 119 | bw_decimal : int 120 | The number of decimal places kept for the bandwidth during the search 121 | 122 | verbose : bool 123 | If true, bandwidth searching history is printed out; default is True. 
124 | time_cost : bool 125 | If true, print run time 126 | """ 127 | 128 | def gwr_func(x): 129 | return getDiag[criterion](GWR( 130 | self.coords, self.X, self.y, x, kernel=self.kernel, 131 | fixed=self.fixed, constant=False, thread=self.thread).cal_aic()) 132 | 133 | bw_min, bw_max = self._init_section(bw_min, bw_max) 134 | bw = golden_section(bw_min, bw_max, delta, bw_decimal, gwr_func, tol, max_iter, verbose) 135 | return bw 136 | 137 | def _init_section(self, bw_min, bw_max): 138 | if bw_min is not None and bw_max is not None: 139 | return bw_min, bw_max 140 | 141 | if len(self.X) > 0: 142 | n_glob = self.X.shape[1] 143 | else: 144 | n_glob = 0 145 | if self.constant: 146 | n_vars = n_glob + 1 147 | else: 148 | n_vars = n_glob 149 | n = np.array(self.coords).shape[0] 150 | 151 | if self.int_score: 152 | a = 40 + 2 * n_vars 153 | c = n 154 | else: 155 | try: 156 | coords = np.unique(self.coords, axis=0) 157 | sq_dists = pdist(coords) 158 | a = np.min(sq_dists) / 2.0 159 | c = np.max(sq_dists) 160 | except MemoryError: 161 | # Note that the value obtained in this way is not the maximum distance of all points, 162 | # but the upper bound of the search has little effect on the results of the model 163 | coords = sorted(self.coords, key=lambda x: x[0] ** 2 + x[1] ** 2) 164 | a = pdist(coords[:2])[0] 165 | c = pdist([coords[0], coords[-1]])[0] 166 | if bw_min is None: 167 | bw_min = a 168 | if bw_max is None: 169 | bw_max = c 170 | 171 | return bw_min, bw_max 172 | 173 | 174 | class SearchMGWRParameter(BaseModel): 175 | 176 | def __init__( 177 | self, 178 | coords: Union[np.ndarray, pd.DataFrame], 179 | X: Union[np.ndarray, pd.DataFrame], 180 | y: Union[np.ndarray, pd.DataFrame], 181 | kernel: str = 'exponential', 182 | fixed: bool = False, 183 | constant: bool = True, 184 | convert: bool = False, 185 | thread: int = 1 186 | ): 187 | 188 | super(SearchMGWRParameter, self).__init__(X, y, kernel, fixed, constant) 189 | if isinstance(coords, pd.DataFrame): 190 | coords 
= coords.values 191 | self.coords = coords 192 | if convert: 193 | longitude = coords[:, 0] 194 | latitude = coords[:, 1] 195 | longitude, latitude = surface_to_plane(longitude, latitude) 196 | self.coords = np.hstack([longitude, latitude]) 197 | self.int_score = not self.fixed 198 | self.thread = thread 199 | self.criterion = None 200 | self.bws = None 201 | self.tol = None 202 | self.bw_decimal = None 203 | 204 | @print_time 205 | def search( 206 | self, 207 | criterion: str = 'AICc', 208 | bw_min: float = None, 209 | bw_max: float = None, 210 | tol: float = 1.0e-6, 211 | bw_decimal: int = 1, 212 | init_bw: float = None, 213 | multi_bw_min: list = None, 214 | multi_bw_max: list = None, 215 | tol_multi: float = 1.0e-5, 216 | bws_same_times: int = 5, 217 | verbose: bool = False, 218 | rss_score: bool = False, 219 | time_cost: bool = False 220 | ): 221 | """ 222 | Method to select a bandwidth vector, with one bandwidth for each covariate 223 | including the intercept, for an MGWR model. 224 | 225 | Parameters 226 | ---------- 227 | criterion : string 228 | bw selection criterion: 'AICc', 'AIC', 'BIC', 'CV' 229 | bw_min : float 230 | min value used in bandwidth search 231 | bw_max : float 232 | max value used in bandwidth search 233 | multi_bw_min : list 234 | min values used for each covariate in mgwr bandwidth search. 235 | Must be either a single value or have one value for 236 | each covariate including the intercept 237 | multi_bw_max : list 238 | max values used for each covariate in mgwr bandwidth 239 | search. Must be either a single value or have one value 240 | for each covariate including the intercept 241 | tol : float 242 | tolerance used to determine convergence 243 | bw_decimal : int 244 | The number of bw decimal places reserved 245 | init_bw : float 246 | None (default) to initialize MGWR with a bandwidth 247 | derived from GWR. Otherwise this option will choose the 248 | bandwidth to initialize MGWR with. 
249 | tol_multi : convergence tolerance for the multiple bandwidth 250 | back fitting algorithm; a larger tolerance may stop the 251 | algorithm faster though it may result in a less optimal 252 | model 253 | bws_same_times : If bandwidths keep the same between iterations for 254 | bws_same_times (default 5) in backfitting, then use the 255 | current set of bandwidths as final bandwidths. 256 | rss_score : True to use the residual sum of squares to evaluate 257 | each iteration of the multiple bandwidth back fitting 258 | routine and False to use a smooth function; default is 259 | False 260 | verbose : Boolean 261 | If true, bandwidth searching history is printed out; default is False. 262 | time_cost : bool 263 | If true, print run time 264 | """ 265 | self.criterion = criterion 266 | self.tol = tol 267 | self.bw_decimal = bw_decimal 268 | if multi_bw_min is not None: 269 | if len(multi_bw_min) == self.k: 270 | multi_bw_min = multi_bw_min 271 | elif len(multi_bw_min) == 1: 272 | multi_bw_min = multi_bw_min * self.k 273 | else: 274 | raise AttributeError( 275 | "multi_bw_min must be either a list containing" 276 | " a single entry or a list containing an entry for each of k" 277 | " covariates including the intercept") 278 | else: 279 | a = self._init_section(bw_min, bw_max)[0] 280 | multi_bw_min = [a] * self.k 281 | 282 | if multi_bw_max is not None: 283 | if len(multi_bw_max) == self.k: 284 | multi_bw_max = multi_bw_max 285 | elif len(multi_bw_max) == 1: 286 | multi_bw_max = multi_bw_max * self.k 287 | else: 288 | raise AttributeError( 289 | "multi_bw_max must be either a list containing" 290 | " a single entry or a list containing an entry for each of k" 291 | " covariates including the intercept") 292 | else: 293 | c = self._init_section(bw_min, bw_max)[1] 294 | multi_bw_max = [c] * self.k 295 | 296 | self.bws = multi_bw(init_bw, self.X, self.y, self.n, self.k, tol_multi, 297 | rss_score, self.gwr_func, self.bw_func, self.sel_func, multi_bw_min, multi_bw_max, 
298 | bws_same_times, verbose=verbose) 299 | return self.bws 300 | 301 | def gwr_func(self, X, y, bw): 302 | res = GWR(self.coords, X, y, bw, kernel=self.kernel, 303 | fixed=self.fixed, constant=False, thread=self.thread).cal_multi() 304 | return res 305 | 306 | def bw_func(self, X, y): 307 | selector = SearchGWRParameter(self.coords, X, y, kernel=self.kernel, fixed=self.fixed, 308 | constant=False, thread=self.thread) 309 | return selector 310 | 311 | def sel_func(self, bw_func, bw_min=None, bw_max=None): 312 | return bw_func.search(criterion=self.criterion, bw_min=bw_min, bw_max=bw_max, 313 | tol=self.tol, bw_decimal=self.bw_decimal, verbose=False) 314 | 315 | def _init_section(self, bw_min, bw_max): 316 | 317 | a = bw_min if bw_min is not None else 0 318 | if bw_max is not None: 319 | c = bw_max 320 | else: 321 | c = max(np.max(self.coords[:, 0]) - np.min(self.coords[:, 0]), 322 | np.max(self.coords[:, 1]) - np.min(self.coords[:, 1])) 323 | return a, c 324 | 325 | 326 | class SearchGTWRParameter(BaseModel): 327 | """ 328 | Select bandwidth for GTWR model 329 | 330 | Parameters 331 | ---------- 332 | coords : array-like 333 | n*2, collection of n sets of (x,y) coordinates of 334 | observations 335 | 336 | t : array-like 337 | n*1, time location 338 | 339 | y : array-like 340 | n*1, dependent variable 341 | 342 | X : array-like 343 | n*k, independent variable, excluding the constant 344 | 345 | kernel : string 346 | type of kernel function used to weight observations; 347 | available options: 348 | 'gaussian' 349 | 'bisquare' 350 | 'exponential' 351 | 352 | fixed : boolean 353 | True for distance based kernel function and False for 354 | adaptive (nearest neighbor) kernel function (default) 355 | 356 | constant : boolean 357 | True to include intercept (default) in model and False to exclude 358 | intercept. 
359 | 360 | Examples 361 | -------- 362 | import numpy as np 363 | from mgtwr.sel import SearchGTWRParameter 364 | np.random.seed(1) 365 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 366 | v = np.array([((i-1) % 144)//12 for i in range(1, 1729)]).reshape(-1, 1) 367 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 368 | x1 = np.random.uniform(0, 1, (1728, 1)) 369 | x2 = np.random.uniform(0, 1, (1728, 1)) 370 | epsilon = np.random.randn(1728, 1) 371 | beta0 = 5 372 | beta1 = 3 + (u + v + t)/6 373 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 374 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 375 | coords = np.hstack([u, v]) 376 | X = np.hstack([x1, x2]) 377 | sel = SearchGTWRParameter(coords, t, X, y, kernel='gaussian', fixed=True) 378 | bw, tau = sel.search(tau_max=20, verbose=True) 379 | 0.9, 1.5 380 | """ 381 | 382 | def __init__( 383 | self, 384 | coords: np.ndarray, 385 | t: np.ndarray, 386 | X: np.ndarray, 387 | y: np.ndarray, 388 | kernel: str = 'exponential', 389 | fixed: bool = False, 390 | constant: bool = True, 391 | convert: bool = False, 392 | thread: int = 1 393 | ): 394 | 395 | super(SearchGTWRParameter, self).__init__(X, y, kernel, fixed, constant) 396 | if isinstance(coords, pd.DataFrame): 397 | coords = coords.values 398 | self.coords = coords 399 | if convert: 400 | longitude = coords[:, 0] 401 | latitude = coords[:, 1] 402 | longitude, latitude = surface_to_plane(longitude, latitude) 403 | self.coords = np.hstack([longitude, latitude]) 404 | self.t = t 405 | self.int_score = not self.fixed 406 | self.thread = thread 407 | 408 | @print_time 409 | def search( 410 | self, 411 | criterion: str = 'AICc', 412 | bw_min: float = None, 413 | bw_max: float = None, 414 | tau_min: float = None, 415 | tau_max: float = None, 416 | tol: float = 1.0e-6, 417 | bw_decimal: int = 1, 418 | tau_decimal: int = 1, 419 | max_iter: int = 200, 420 | verbose: bool = False, 421 | time_cost: bool = False 422 | ): 
423 | """ 424 | Method to select one unique bandwidth and Spatio-temporal scale for a GTWR model. 425 | 426 | Parameters 427 | ---------- 428 | criterion : string 429 | bw selection criterion: 'AICc', 'AIC', 'BIC', 'CV' 430 | bw_min : float 431 | min value used in bandwidth search 432 | bw_max : float 433 | max value used in bandwidth search 434 | tau_min : float 435 | min value used in spatio-temporal scale search 436 | tau_max : float 437 | max value used in spatio-temporal scale search 438 | tol : float 439 | tolerance used to determine convergence 440 | max_iter : integer 441 | max iterations if no convergence to tol 442 | bw_decimal : scalar 443 | The number of bandwidth's decimal places saved during the search 444 | tau_decimal : scalar 445 | The number of Spatio-temporal decimal places saved during the search 446 | verbose : Boolean 447 | If true, bandwidth searching history is printed out; default is False. 448 | time_cost : bool 449 | If true, print run time 450 | """ 451 | 452 | def gtwr_func(x, y): 453 | return getDiag[criterion](GTWR( 454 | self.coords, self.t, self.X, self.y, x, y, kernel=self.kernel, 455 | fixed=self.fixed, constant=False, thread=self.thread).cal_aic()) 456 | 457 | bw_min, bw_max, tau_min, tau_max = self._init_section(bw_min, bw_max, tau_min, tau_max) 458 | bw, tau = twostep_golden_section(bw_min, bw_max, tau_min, tau_max, delta, gtwr_func, tol, max_iter, bw_decimal, 459 | tau_decimal, verbose) 460 | 461 | return bw, tau 462 | 463 | def _init_section(self, bw_min, bw_max, tau_min, tau_max): 464 | if (bw_min is not None) and (bw_max is not None) and (tau_min is not None) and (tau_max is not None): 465 | return bw_min, bw_max, tau_min, tau_max 466 | if len(self.X) > 0: 467 | n_glob = self.X.shape[1] 468 | else: 469 | n_glob = 0 470 | if self.constant: 471 | n_vars = n_glob + 1 472 | else: 473 | n_vars = n_glob 474 | n = np.array(self.coords).shape[0] 475 | 476 | if self.int_score: 477 | a = 40 + 2 * n_vars 478 | c = n 479 | else: 480 | 
try: 481 | coords = np.unique(self.coords, axis=0) 482 | sq_dists = pdist(coords) 483 | a = np.min(sq_dists) / 2.0 484 | c = np.max(sq_dists) 485 | except MemoryError: 486 | # Note that the value obtained in this way is not the maximum distance of all points, 487 | # but the upper bound of the search has little effect on the results of the model 488 | coords = sorted(self.coords, key=lambda x: x[0] ** 2 + x[1] ** 2) 489 | a = pdist(coords[:2])[0] 490 | c = pdist([coords[0], coords[-1]])[0] 491 | 492 | if bw_min is None: 493 | bw_min = a 494 | if bw_max is None: 495 | bw_max = c 496 | 497 | if tau_min is None: 498 | tau_min = 0 499 | if tau_max is None: 500 | tau_max = 4 501 | return bw_min, bw_max, tau_min, tau_max 502 | 503 | 504 | class SearchMGTWRParameter(BaseModel): 505 | """ 506 | Select bandwidth for MGTWR model 507 | 508 | Parameters 509 | ---------- 510 | coords : array-like 511 | n*2, collection of n sets of (x,y) coordinates of 512 | observations 513 | 514 | t : array-like 515 | n*1, time location 516 | 517 | X : array-like 518 | n*k, independent variable, excluding the constant 519 | 520 | y : array-like 521 | n*1, dependent variable 522 | 523 | kernel : string 524 | type of kernel function used to weight observations; 525 | available options: 526 | 'gaussian' 527 | 'bisquare' 528 | 'exponential' 529 | 530 | fixed : bool 531 | True for distance based kernel function and False for 532 | adaptive (nearest neighbor) kernel function (default) 533 | 534 | constant : bool 535 | True to include intercept (default) in model and False to exclude 536 | intercept. 
537 | 538 | Examples 539 | -------- 540 | import numpy as np 541 | from mgtwr.sel import SearchMGTWRParameter 542 | from mgtwr.model import MGTWR 543 | np.random.seed(10) 544 | u = np.array([(i-1) % 12 for i in range(1, 1729)]).reshape(-1, 1) 545 | v = np.array([((i-1) % 144)//12 for i in range(1, 1729)]).reshape(-1, 1) 546 | t = np.array([(i-1) // 144 for i in range(1, 1729)]).reshape(-1, 1) 547 | x1 = np.random.uniform(0, 1, (1728, 1)) 548 | x2 = np.random.uniform(0, 1, (1728, 1)) 549 | epsilon = np.random.randn(1728, 1) 550 | beta0 = 5 551 | beta1 = 3 + (u + v + t)/6 552 | beta2 = 3 + ((36-(6-u)**2)*(36-(6-v)**2)*(36-(6-t)**2)) / 128 553 | y = beta0 + beta1 * x1 + beta2 * x2 + epsilon 554 | coords = np.hstack([u, v]) 555 | X = np.hstack([x1, x2]) 556 | sel_multi = SearchMGTWRParameter(coords, t, X, y, kernel='gaussian', fixed=True) 557 | bws = sel_multi.search(multi_bw_min=[0.1], verbose=True, tol_multi=1.0e-4) 558 | mgtwr = MGTWR(coords, t, X, y, sel_multi, kernel='gaussian', fixed=True).fit() 559 | print(mgtwr.R2) 560 | 0.9972924820674222 561 | """ 562 | def __init__( 563 | self, 564 | coords: np.ndarray, 565 | t: np.ndarray, 566 | X: np.ndarray, 567 | y: np.ndarray, 568 | kernel: str = 'exponential', 569 | fixed: bool = False, 570 | constant: bool = True, 571 | convert: bool = False, 572 | thread: int = 1 573 | ): 574 | 575 | super(SearchMGTWRParameter, self).__init__(X, y, kernel, fixed, constant) 576 | if isinstance(coords, pd.DataFrame): 577 | coords = coords.values 578 | self.coords = coords 579 | if convert: 580 | longitude = coords[:, 0] 581 | latitude = coords[:, 1] 582 | longitude, latitude = surface_to_plane(longitude, latitude) 583 | self.coords = np.hstack([longitude, latitude]) 584 | self.t = t 585 | self.int_score = not self.fixed 586 | self.thread = thread 587 | self.criterion = None 588 | self.bws = None 589 | self.tol = None 590 | self.bw_decimal = None 591 | self.tau_decimal = None 592 | 593 | @print_time 594 | def search( 595 | self, 596 | 
            criterion: str = 'AICc',
            bw_min: float = None,
            bw_max: float = None,
            tau_min: float = None,
            tau_max: float = None,
            tol: float = 1.0e-6,
            bw_decimal: int = 1,
            tau_decimal: int = 1,
            init_bw: float = None,
            init_tau: float = None,
            multi_bw_min: list = None,
            multi_bw_max: list = None,
            multi_tau_min: list = None,
            multi_tau_max: list = None,
            tol_multi: float = 1.0e-5,
            verbose: bool = False,
            rss_score: bool = False,
            time_cost: bool = False
    ):
        """
        Method to select one unique bandwidth and spatio-temporal scale for a GTWR model or a
        bandwidth vector and spatio-temporal scale vector for an MGTWR model.

        Parameters
        ----------
        criterion : string
            bw selection criterion: 'AICc', 'AIC', 'BIC', 'CV'
        bw_min : float
            min value used in bandwidth search
        bw_max : float
            max value used in bandwidth search
        tau_min : float
            min value used in spatio-temporal scale search
        tau_max : float
            max value used in spatio-temporal scale search
        multi_bw_min : list
            min values used for each covariate in MGTWR bandwidth search.
            Must be either a single value or have one value for
            each covariate including the intercept
        multi_bw_max : list
            max values used for each covariate in MGTWR bandwidth
            search. Must be either a single value or have one value
            for each covariate including the intercept
        multi_tau_min : list
            min values used for each covariate in MGTWR spatio-temporal scale
            search. Must be either a single value or have one value
            for each covariate including the intercept
        multi_tau_max : list
            max values used for each covariate in MGTWR spatio-temporal scale
            search.
            Must be either a single value or have one value
            for each covariate including the intercept
        tol : float
            tolerance used to determine convergence
        bw_decimal : int
            The number of bw decimal places reserved
        tau_decimal : int
            The number of tau decimal places reserved
        init_bw : float
            None (default) to initialize MGTWR with a bandwidth
            derived from GTWR. Otherwise this option will choose the
            bandwidth to initialize MGTWR with.
        init_tau : float
            None (default) to initialize MGTWR with a spatio-temporal scale
            derived from GTWR. Otherwise this option will choose the
            spatio-temporal scale to initialize MGTWR with.
        tol_multi : float
            convergence tolerance for the multiple bandwidth
            back fitting algorithm; a larger tolerance may stop the
            algorithm faster though it may result in a less optimal
            model
        rss_score : bool
            True to use the residual sum of squares to evaluate
            each iteration of the multiple bandwidth back fitting
            routine and False to use a smooth function; default is
            False
        verbose : bool
            If true, bandwidth searching history is printed out; default is False.
        time_cost : bool
            If true, print run time
        """
        self.criterion = criterion
        self.tol = tol
        self.bw_decimal = bw_decimal
        self.tau_decimal = tau_decimal
        # Each multi_* bound must hold one entry, which is repeated for all
        # k covariates, or exactly k entries (intercept included).
        if multi_bw_min is not None:
            if len(multi_bw_min) == 1:
                multi_bw_min = multi_bw_min * self.k
            elif len(multi_bw_min) != self.k:
                raise AttributeError(
                    "multi_bw_min must be either a list containing"
                    " a single entry or a list containing an entry for each of k"
                    " covariates including the intercept")
        else:
            a = self._init_section(bw_min, bw_max, tau_min, tau_max)[0]
            multi_bw_min = [a] * self.k

        if multi_bw_max is not None:
            if len(multi_bw_max) == 1:
                multi_bw_max = multi_bw_max * self.k
            elif len(multi_bw_max) != self.k:
                raise AttributeError(
                    "multi_bw_max must be either a list containing"
                    " a single entry or a list containing an entry for each of k"
                    " covariates including the intercept")
        else:
            c = self._init_section(bw_min, bw_max, tau_min, tau_max)[1]
            multi_bw_max = [c] * self.k

        if multi_tau_min is not None:
            if len(multi_tau_min) == 1:
                multi_tau_min = multi_tau_min * self.k
            elif len(multi_tau_min) != self.k:
                raise AttributeError(
                    "multi_tau_min must be either a list containing"
                    " a single entry or a list containing an entry for each of k"
                    " covariates including the intercept")
        else:
            A = self._init_section(bw_min, bw_max, tau_min, tau_max)[2]
            multi_tau_min = [A] * self.k

        if multi_tau_max is not None:
            if len(multi_tau_max) == 1:
                multi_tau_max = multi_tau_max * self.k
            elif len(multi_tau_max) != self.k:
                raise AttributeError(
                    "multi_tau_max must be either a list containing"
                    " a single entry or a list containing an entry for each of k"
                    " covariates including the intercept")
        else:
            C = self._init_section(bw_min, bw_max, tau_min, tau_max)[3]
            multi_tau_max = [C] * self.k

        self.bws = multi_bws(init_bw, init_tau, self.X, self.y, self.n, self.k, tol_multi,
                             rss_score, self.gtwr_func, self.bw_func, self.sel_func, multi_bw_min, multi_bw_max,
                             multi_tau_min, multi_tau_max, verbose=verbose)
        return self.bws

    def gtwr_func(self, X, y, bw, tau):
        # Fit a single GTWR model without intercept for one back-fitting step
        return GTWR(self.coords, self.t, X, y, bw, tau, kernel=self.kernel,
                    fixed=self.fixed, constant=False, thread=self.thread).cal_multi()

    def bw_func(self, X, y):
        # Build the GTWR selector used to search one covariate's parameters
        selector = SearchGTWRParameter(self.coords, self.t, X, y, kernel=self.kernel, fixed=self.fixed,
                                       constant=False, thread=self.thread)
        return selector

    def sel_func(self, bw_func, bw_min=None, bw_max=None, tau_min=None, tau_max=None):
        # Run the selector with the settings stored by search()
        return bw_func.search(criterion=self.criterion, bw_min=bw_min, bw_max=bw_max, tau_min=tau_min, tau_max=tau_max,
                              tol=self.tol, bw_decimal=self.bw_decimal, tau_decimal=self.tau_decimal, verbose=False)

    def _init_section(self, bw_min, bw_max, tau_min, tau_max):

        # Default bandwidth range: 0 up to the larger extent of the two coordinate axes
        a = bw_min if bw_min is not None else 0
        if bw_max is not None:
            c = bw_max
        else:
            c = max(np.max(self.coords[:, 0]) - np.min(self.coords[:, 0]),
                    np.max(self.coords[:, 1]) - np.min(self.coords[:, 1]))

        # Default spatio-temporal scale range: [0, 4]
        A = tau_min if tau_min is not None else 0
        C = tau_max if tau_max is not None else 4

        return a, c, A, C

--------------------------------------------------------------------------------
/mgtwr/setup.py:
--------------------------------------------------------------------------------
import setuptools

setuptools.setup(
    version="2.0.5",
    long_description="To fit geographically weighted model, "
                     "multiscale geographically weighted regression model, "
                     "geographically and temporally weighted regression model and "
                     "multiscale geographically and temporally weighted regression model.",
    author="Kun Sun",
    author_email="849024477@qq.com",
    packages=['mgtwr'],
    url="https://github.com/sunkun1997/mgtwr",
)

--------------------------------------------------------------------------------
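Note on `SearchMGTWRParameter.search` in sel.py: the method applies the same rule to `multi_bw_min`, `multi_bw_max`, `multi_tau_min` and `multi_tau_max`. A list with one entry is repeated for all k covariates, a list with k entries is used as given, and `None` falls back to a default. A minimal sketch of that rule as one function follows. The name `expand_bound` and its signature are hypothetical and not part of mgtwr; it only illustrates the rule and is not the package's implementation.

```python
from typing import List, Optional, Sequence


def expand_bound(values: Optional[Sequence[float]], k: int,
                 default: float, name: str) -> List[float]:
    """Hypothetical helper (not in mgtwr): return one bound per covariate."""
    if values is None:
        # No user input: use the same default bound for every covariate
        return [default] * k
    values = list(values)
    if len(values) == 1:
        # A single entry is shared by all k covariates
        return values * k
    if len(values) == k:
        return values
    # Same exception type as the checks in search()
    raise AttributeError(
        f"{name} must be either a list containing a single entry or a list "
        f"containing an entry for each of k covariates including the intercept")


# Usage: three covariates (intercept included), one shared lower bound
print(expand_bound([0.1], 3, 0.0, "multi_bw_min"))
```

With a helper of this shape, each of the four checks in `search` would reduce to a single call, with the default taken from `_init_section`.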