├── LICENSE ├── README.md ├── heatplot.ado ├── heatplot.pkg ├── heatplot.sthlp ├── hexplot.ado ├── hexplot.sthlp └── stata.toc /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 benjann 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # heatplot 2 | Stata module to create heat plots and hexagon plots 3 | 4 | `heatplot` creates heat plots from variables or matrices. One 5 | example of a heat plot is a two-dimensional histogram in which the 6 | frequencies of combinations of binned Y and X are displayed as 7 | rectangular (or hexagonal) fields using a color gradient. 
Another example 8 | is a plot of a trivariate distribution where the color gradient is used to 9 | visualize the (average) value of Z within bins of Y and 10 | X. Yet another example is a plot that displays the contents of a matrix, 11 | say, a correlation matrix or a spatial weights matrix, using a color 12 | gradient. 13 | 14 | To install `heatplot` from the SSC Archive, type 15 | 16 | . ssc install heatplot, replace 17 | 18 | in Stata. The `palettes` package and, in Stata 14.2 or newer, 19 | the `colrspace` package are required. To install these packages, type 20 | 21 | . ssc install palettes, replace 22 | . ssc install colrspace, replace 23 | 24 | Furthermore, the `fast` option of `heatplot` requires the `gtools` package. To 25 | install `gtools`, type 26 | 27 | . ssc install gtools, replace 28 | . gtools, upgrade 29 | 30 | --- 31 | 32 | Installation from GitHub: 33 | 34 | . net install heatplot, replace from(https://raw.githubusercontent.com/benjann/heatplot/master/) 35 | 36 | --- 37 | 38 | Main changes: 39 | 40 | 24aug2021 41 | - textbox_options and legend_options were not fully supported in 42 | keylabels(); this is fixed 43 | - position() in legend() in by() was not passed through; this is fixed 44 | - improved checks for required packages and corresponding error messages 45 | 46 | 20jul2021 47 | - values(label(exp)) can now be string in syntax 1; the statistic() suboption 48 | will be set to -first- in this case 49 | 50 | 19jul2021 51 | - new [x|y]bcuts() option to cut x and y at arbitrary values (not allowed with 52 | hexagon) 53 | - option values() has been revised; new label() suboption can be used to select 54 | a secondary variable or matrix for the values; new transform() suboption 55 | transforms the values; other suboptions have been renamed 56 | - size(exp) is now also allowed in syntax 2 and 3, where exp is the name of a 57 | (mata) matrix; size() now has a statistic() suboption to set the type of 58 | aggregation; observations for which exp is
missing are no longer excluded 59 | from the estimation sample 60 | - new -normalize- option normalizes the plotted results by dividing by the size 61 | of the corresponding color field 62 | - if [x|y]discrete is specified together with hexagon, a color field is now printed 63 | at each unique value (similar to the behavior without option hexagon) 64 | - shapes of clipped hexagons were not always correct; this is fixed 65 | - option generate failed in syntax 2 and 3 if the current dataset was empty 66 | (unless -nopreserve- was specified); this is fixed 67 | - contrary to what the documentation states, palette -hcl, viridis- was used as the 68 | default palette in Stata 14.2 or newer instead of palette -viridis-; this is fixed 69 | 70 | 13oct2020 71 | - option colors() did not work with color specifications that included 72 | quotes; this is fixed 73 | 74 | 07sep2019 75 | - new ramp() option 76 | - new equations() option in syntax 3 77 | - heatplot could break if there were only very few observations; this is fixed 78 | 79 | 21jun2019 80 | - a note is now displayed if there are observations outside the binning range of y and x 81 | - binning of x and y was erroneous at the edges if suboption tight was specified and 82 | the requested binning range was smaller than the data range; this is fixed 83 | - undocumented idgenerate() option to store bin IDs (to confirm binning) 84 | 85 | 31may2019 86 | - added -fast- option to use fast commands from -gtools- (-gcollapse- instead of 87 | official -collapse- for aggregation; -gegen- functions to handle categorical 88 | variables instead of -bysort-) 89 | - added faster code to write Mata matrix to data if full matrix is used 90 | 91 | 25may2019 92 | - there was a bug in how the intervals were computed if cuts() was specified 93 | and did not contain @min or @max and the specified minimum cut was larger than 94 | min of data or specified maximum cut was smaller than max of data 95 | 96 | 20may2019 97 | - option -srange()- added 98
| -------------------------------------------------------------------------------- /heatplot.ado: -------------------------------------------------------------------------------- 1 | *! version 1.1.1 24aug2021 Ben Jann 2 | 3 | capt which colorpalette 4 | if _rc { 5 | di as err "-colorpalette- is required; type {stata ssc install palettes, replace}" 6 | if c(stata_version)>=14.2 { 7 | capt findfile lcolrspace.mlib 8 | if _rc { 9 | di as error "-colrspace- is required; type {stata ssc install colrspace, replace}" 10 | } 11 | } 12 | exit 499 13 | } 14 | 15 | program heatplot, rclass 16 | version 13 17 | 18 | // some tempnames for scalars 19 | tempname CUTS MIN MAX 20 | local scalars y_K y_MIN y_MAX y_LB y_UB y_WD x_K x_MIN x_MAX x_LB x_UB x_WD 21 | tempname `scalars' 22 | foreach scalar of local scalars { 23 | scalar ``scalar'' = . 24 | } 25 | 26 | // syntax 27 | // - list of common options 28 | local zopts LEVels(int 0) CUTs(str) Colors(str asis) NORMalize /// 29 | size SIZE2(str asis) srange(numlist max=2 >=0) TRANSform(str asis) /// 30 | MISsing MISsing2(str) VALues VALues2(str asis) /// 31 | HEXagon HEXagon2(str asis) /// 32 | scatter SCATTER2(str asis) KEYlabels(str asis) p(str) /// 33 | RAMP RAMP2(str asis) BACKFill BACKFill2(str) 34 | local yxopts /// 35 | bins(str) BWidth(str) DISCRete DISCRete2(numlist max=1) /// 36 | xbins(str) XBWidth(str) XDISCRete XDISCRete2(numlist max=1) /// 37 | ybins(str) YBWidth(str) YDISCRete YDISCRete2(numlist max=1) /// 38 | clip lclip rclip tclip bclip /// 39 | BCuts(numlist ascending min=2) XBCuts(numlist ascending min=2) /// 40 | YBCuts(numlist ascending min=2) 41 | local matopts lower upper noDIAGonal drop(numlist) 42 | local gopts noGRaph addplot(str asis) ADDPLOTNOPReserve /// 43 | GENerate GENerate2(str) Replace noPREServe /// 44 | YAXis(passthru) XAXis(passthru) * 45 | _parse comma matrix rest : 0 46 | capt _parse_mata, `matrix' 47 | if _rc==0 { // syntax 2: heatplot mata(M) 48 | local syntax 2 49 | local 0 `"`rest'"' 50 | 
syntax [, `zopts' Statistic(name) fast `yxopts' `matopts' noLabel `gopts' ] 51 | capt mata mata describe `mata' 52 | if _rc { 53 | di as err `"Mata matrix `mata' not found"' 54 | exit _rc 55 | } 56 | } 57 | else if `: list sizeof matrix'==1 { // syntax 3: heatplot matrix 58 | local syntax 3 59 | syntax [anything(name=matrix)] [, `zopts' `matopts' /// 60 | EQuations EQuations2(str) Label `gopts' ] 61 | confirm matrix `matrix' 62 | if `"`equations2'"'!="" local equations equations 63 | } 64 | else { // syntax 1: heatplot [z] y x 65 | local syntax 1 66 | syntax varlist(min=2 max=3 fv) [if] [in] [aw fw iw pw] [, /// 67 | `zopts' Statistic(name) fast SIZEProp RECenter /// 68 | `yxopts' FILLin(numlist max=2 missingok) noLabel /// 69 | idgenerate(str) /// undocumented 70 | `gopts' ] 71 | } 72 | // - handle hexagon option (must do this first) 73 | if `"`hexagon2'"'!="" local hexagon hexagon 74 | if "`hexagon'"!="" { 75 | _parse_hex, `hexagon2' // returns hexdir hexorder hexodd 76 | } 77 | // - collect variables 78 | if `syntax'==1 { 79 | gettoken z0 y0 : varlist 80 | gettoken y0 x0 : y0 81 | gettoken x0 : x0 82 | if `"`x0'"'=="" { 83 | local x0 `"`y0'"' 84 | local y0 `"`z0'"' 85 | local z0 86 | } 87 | if "`hexdir'"=="1" { // flip variables 88 | local tmp `x0' 89 | local x0 `y0' 90 | local y0 `tmp' 91 | } 92 | if "`z0'"!="" { 93 | capt confirm numeric variable `z0' 94 | if _rc==101 { 95 | di as err `"'`z0'' found where numeric variable expected"' 96 | exit 7 97 | } 98 | else { 99 | confirm numeric variable `z0' 100 | } 101 | } 102 | } 103 | // - handle discrete() 104 | if inlist(`syntax',1,2) { 105 | if "`discrete2'"!="" local discrete discrete 106 | if "`discrete2'"=="" local discrete2 1 // default width: 1 unit 107 | foreach v in x y { 108 | if "`discrete'"!="" local `v'discrete `v'discrete 109 | if "``v'discrete2'"!="" local `v'discrete `v'discrete 110 | if "``v'discrete2'"=="" local `v'discrete2 `discrete2' 111 | } 112 | } 113 | // - handle categorical variable (i. 
or string) 114 | if `syntax'==1 { 115 | foreach v in x y { 116 | if substr("``v'0'",1,2)=="i." { 117 | local `v'0 = substr("``v'0'",3,.) 118 | local `v'cat `v'cat 119 | } 120 | else { 121 | capt confirm numeric variable ``v'0' 122 | if _rc local `v'cat `v'cat 123 | } 124 | if "``v'cat'``v'discrete'"!="" { 125 | if "``v'cat'"!="" local `v'discrete "" 126 | scalar ``v'_WD' = ``v'discrete2' 127 | } 128 | } 129 | } 130 | // - handle z-options: levels(), cuts(), missing(), values(), keylabels() 131 | if `"`cuts'"'!="" { 132 | if `levels'>0 { 133 | di as err "only one of levels() and cuts() allowed" 134 | exit 198 135 | } 136 | _check_cuts `CUTS', cuts(`cuts') 137 | } 138 | if `"`missing2'"'!="" local missing missing 139 | if "`missing'"!="" _parse_missing, `missing2' 140 | if `"`values2'"'!="" local values values 141 | if "`values'"!="" _parse_values, `values2' 142 | if `"`ramp2'"'!="" local ramp ramp 143 | if "`ramp'"!="" { 144 | if `"`keylabels'"'!="" { 145 | di as err "ramp() and keylabels() not both allowed" 146 | exit 198 147 | } 148 | tempvar ramp_ID ramp_Y ramp_X 149 | _parse_ramp `ramp_Y' `ramp_X', `ramp2' 150 | } 151 | else { 152 | _parse_keylab `keylabels' 153 | } 154 | if `"`backfill2'"'!="" local backfill backfill 155 | if "`backfill'"!="" _parse_backfill, `backfill2' 156 | // - handle bins() 157 | if inlist(`syntax',1,2) { 158 | if `"`bins'`bwidth'`bcuts'"'!="" { 159 | if `"`bcuts'"'!="" & "`hexagon'"!="" { 160 | di as err "bcuts() not allowed together with hexagon" 161 | exit 198 162 | } 163 | if "`xcat'`xdiscrete'"!="" & "`ycat'`ydiscrete'"!="" { 164 | di as err "bins/bwidth/bcuts() not allowed with categorical/discrete variables" 165 | exit 198 166 | } 167 | if ((`"`bins'"'!="") + (`"`bwidth'"'!="") + (`"`bcuts'"'!=""))>1 { 168 | di as err "only one of bins(), bwidth() and bcuts() allowed" 169 | exit 198 170 | } 171 | } 172 | foreach v in x y { 173 | if `"``v'bins'``v'bwidth'``v'bcuts'"'!="" { 174 | if `"``v'bcuts'"'!="" & "`hexagon'"!="" { 175 | di as
err "`v'bcuts() not allowed together with hexagon" 176 | exit 198 177 | } 178 | if "``v'cat'``v'discrete'"!="" { 179 | di as err "`v'bins/bwidth/bcuts() not allowed with categorical/discrete `v'" 180 | exit 198 181 | } 182 | if ((`"``v'bins'"'!="") + (`"``v'bwidth'"'!="") + (`"``v'bcuts'"'!=""))>1 { 183 | di as err "only one of `v'bins(), `v'bwidth() and `v'bcuts() allowed" 184 | exit 198 185 | } 186 | if `"``v'bins'"'!="" { 187 | _parse_bins `v' 0 ``v'_K' ``v'_LB' ``v'_UB' ``v'_WD' ``v'bins' 188 | } 189 | else if `"``v'bcuts'"'!="" { 190 | _parse_bcuts `v' ``v'_K' ``v'_LB' ``v'_UB' ``v'_WD' "``v'bcuts'" 191 | } 192 | else { 193 | _parse_bins `v' 1 ``v'_K' ``v'_LB' ``v'_UB' ``v'_WD' ``v'bwidth' 194 | } 195 | } 196 | else if "``v'cat'``v'discrete'"=="" { 197 | if `"`bins'"'!="" { 198 | _parse_bins `v' 0 ``v'_K' ``v'_LB' ``v'_UB' ``v'_WD' `bins' 199 | } 200 | else if `"`bcuts'"'!="" { 201 | _parse_bcuts `v' ``v'_K' ``v'_LB' ``v'_UB' ``v'_WD' "`bcuts'" 202 | } 203 | else { 204 | _parse_bins `v' 1 ``v'_K' ``v'_LB' ``v'_UB' ``v'_WD' `bwidth' 205 | } 206 | } 207 | } 208 | } 209 | 210 | // - clip option 211 | if "`clip'"!="" { 212 | local xclip clip 213 | local yclip clip 214 | } 215 | else { 216 | if "`rclip'"!="" & "`lclip'"!="" local xclip clip 217 | else if "`rclip'"!="" local xclip rclip 218 | else if "`lclip'"!="" local xclip lclip 219 | if "`tclip'"!="" & "`bclip'"!="" local yclip clip 220 | else if "`tclip'"!="" local yclip rclip 221 | else if "`bclip'"!="" local yclip lclip 222 | } 223 | if "`hexdir'"=="1" { 224 | local tmp `xclip' 225 | local xclip `yclip' 226 | local yclip `tmp' 227 | } 228 | // - handle recenter 229 | if `syntax'==1 { 230 | if "`recenter'"!="" { 231 | if "`xcat'`xdiscrete'"=="" local xrecenter xrecenter 232 | if "`ycat'`ydiscrete'"=="" local yrecenter yrecenter 233 | } 234 | } 235 | // - handle statistic() and size() 236 | _parse_size2 `size2' 237 | if `"`size2'"'!="" local size size 238 | if `syntax'==1 { 239 | if `"`statistic'"'=="" { 240 |
if "`z0'"=="" local statistic "percent" 241 | else local statistic "mean" 242 | } 243 | else if `"`statistic'"'=="asis" & "`z0'"=="" { 244 | di as err "statistic(asis) only allowed if z variable is specified" 245 | exit 198 246 | } 247 | if (("`size'"!="") + ("`sizeprop'"!=""))>1 { 248 | di as err "only one of size() and sizeprop allowed" 249 | exit 198 250 | } 251 | } 252 | else if `syntax'==2 { 253 | if `"`statistic'"'=="" { 254 | if "`ydiscrete'"!="" & "`xdiscrete'"!="" local statistic asis 255 | else local statistic sum 256 | } 257 | } 258 | else /*syntax==3*/ local statistic asis 259 | // - handle scatter option 260 | if `"`scatter2'"'!="" local scatter scatter 261 | if "`scatter'"!="" { 262 | if "`hexagon'"!="" { 263 | di as err "hexagon and scatter not both allowed" 264 | exit 198 265 | } 266 | if `"`scatter2'"'!="" { 267 | _symbolpalette `scatter2' 268 | local scatter2 `"`r(p)'"' 269 | } 270 | if `"`scatter2'"'=="" local scatter2 O 271 | } 272 | // - handle lower/upper 273 | if inlist(`syntax',2,3) { 274 | if "`upper'"!="" & "`lower'"!="" { 275 | di as err "upper and lower not both allowed" 276 | exit 198 277 | } 278 | } 279 | // - handle graph options, including p#() 280 | if `syntax'==1 { 281 | _get_gropts, graphopts(`options') gettwoway grbyable getbyallowed(LEGend) missingallowed 282 | local by "`s(varlist)'" 283 | if "`by'"!="" { 284 | local bymissing "`s(missing)'" 285 | _parse_bylegend, `s(by_legend)' `ramp' // returns bylegend 286 | local byopt by(`by', `byopt' `bymissing' `bylegend' `s(byopts)') 287 | } 288 | } 289 | else { 290 | _get_gropts, graphopts(`options') gettwoway 291 | } 292 | local options `s(twowayopts)' 293 | _parse_popts `s(graphopts)' 294 | _check_gropts, `graphopts' 295 | local options `options' `graphopts' 296 | local AXIS `yaxis' `xaxis' // need to pass through to each plot 297 | if "`ramp'"!="" { 298 | _parse_ramp_gropts, `options' 299 | } 300 | // generate option 301 | _parse_generate `generate2' 302 | if "`generate2'"!="" local 
generate generate 303 | if "`generate'"!="" { 304 | foreach v0 in _Z _Zid _Y _Yshape _X _Xshape _Size _Mlab { 305 | gettoken v generate2 : generate2 306 | if `"`v'"'=="" local v `v0' 307 | if "`replace'`preserve'"=="" confirm new variable `v' 308 | local vnames `vnames' `v' 309 | } 310 | } 311 | 312 | // prepare data 313 | // - preserve 314 | if `"`idgenerate'"'!="" { // undocumented 315 | tempvar ID 316 | gen double `ID' = _n 317 | } 318 | local no_dataset_in_use = (c(k)==0) 319 | preserve 320 | qui count 321 | local N0 = r(N) 322 | // - write matrix to data 323 | if inlist(`syntax',2,3) { 324 | tempvar x0 y0 z0 325 | qui gen double `z0' = . 326 | qui gen double `y0' = . 327 | qui gen double `x0' = . 328 | if `"`size2'"'!="" { 329 | confirm name `size2' 330 | if `:list sizeof size2'>1 { 331 | di as err `"'`size2'' found where name expected"' 332 | exit 7 333 | } 334 | if `syntax'==2 { 335 | capt mata mata describe `size2' 336 | if _rc { 337 | di as err `"Mata matrix `size2' not found"' 338 | exit _rc 339 | } 340 | } 341 | else { // syntax=3 342 | confirm matrix `size2' 343 | } 344 | tempvar z2 345 | qui gen `z2' = . 346 | } 347 | if `"`valuesexp'"'!="" { 348 | confirm name `valuesexp' 349 | if `:list sizeof valuesexp'>1 { 350 | di as err `"'`valuesexp'' found where name expected"' 351 | exit 7 352 | } 353 | if `syntax'==2 { 354 | capt mata mata describe `valuesexp' 355 | if _rc { 356 | di as err `"Mata matrix `valuesexp' not found"' 357 | exit _rc 358 | } 359 | } 360 | else { // syntax=3 361 | confirm matrix `valuesexp' 362 | } 363 | tempvar z3 364 | qui gen `z3' = . 365 | } 366 | if `syntax'==2 { 367 | if `"`size2'"'=="" local size2 J(0,0,.) 368 | if `"`valuesexp'"'=="" local valuesexp J(0,0,.) 
369 | mata: writematamatrixtodata(`mata', `size2', `valuesexp') 370 | } 371 | else { // syntax=3 372 | mata: writematrixtodata("`matrix'", `"`size2'"', `"`valuesexp'"') 373 | } 374 | local size2 375 | local valuesexp 376 | if "`hexdir'"=="1" { // flip variables 377 | local tmp `x0' 378 | local x0 `y0' 379 | local y0 `tmp' 380 | } 381 | } 382 | // - select sample 383 | marksample touse, novarlist 384 | markout `touse' `y0' `x0', strok 385 | if "`missing'"=="" { 386 | if "`z0'"!="" markout `touse' `z0' 387 | } 388 | if `syntax'==1 { 389 | if "`by'"!="" & "`bymissing'"=="" markout `touse' `by', strok 390 | } 391 | if `"`size2'"'!="" { 392 | tempvar z2 393 | qui gen double `z2' = (`size2') if `touse' 394 | } 395 | if `"`valuesexp'"'!="" { 396 | tempvar z3 397 | qui gen `z3' = (`valuesexp') if `touse' 398 | } 399 | qui keep if `touse' 400 | // - handle weights and count observations 401 | if `syntax'==1 { 402 | if "`weight'"!="" { 403 | tempvar w 404 | qui gen double `w' `exp' 405 | local wgt "[`weight' = `w']" 406 | if "`weight'"=="pweight" local swgt "[aw = `w']" 407 | else local swgt "[`weight' = `w']" 408 | su `touse' `swgt', meanonly 409 | local N = r(N) 410 | } 411 | else { 412 | qui count 413 | local N = r(N) 414 | } 415 | } 416 | else { 417 | qui count 418 | local N = r(N) 419 | } 420 | // - drop all irrelevant variables 421 | keep `y0' `x0' `z0' `z2' `z3' `w' `by' `ID' 422 | 423 | // collect titles for axes and legend 424 | if `syntax'==1 { 425 | if "`xcat'"=="" | `"`: value label `x0''"'=="" | "`label'"!="" { 426 | if "`label'"=="" local xtitle: var lab `x0' 427 | if `"`xtitle'"'=="" local xtitle `x0' 428 | } 429 | if "`ycat'"=="" | `"`: value label `y0''"'=="" | "`label'"!="" { 430 | if "`label'"=="" local ytitle: var lab `y0' 431 | if `"`ytitle'"'=="" local ytitle `y0' 432 | } 433 | } 434 | else if `syntax'==2 { 435 | local xtitle Columns 436 | local ytitle Rows 437 | } 438 | if "`z0'"=="" local ztitle "`statistic'" 439 | else { 440 | if `syntax'==2 { 441 | 
if "`statistic'"!="asis" local ztitle "`statistic'" 442 | else local ztitle "`mata'" 443 | } 444 | else if `syntax'==3 local ztitle "`matrix'" 445 | else local ztitle "`z0'" 446 | } 447 | if `"`transform'"'!="" & `"`retransform'"'=="" { 448 | local ztitle: subinstr local transform "@" `"`ztitle'"', all 449 | } 450 | 451 | // make bins of x and y 452 | if `"`idgenerate'"'!="" { // undocumented 453 | gettoken IDY IDX : idgenerate 454 | gettoken IDX : IDX 455 | } 456 | tempname x y 457 | if inlist(`syntax',1,2) { 458 | foreach v in x y { 459 | if "``v'cat'"!="" /// 460 | _makebin_categorical "`fast'" `v' ``v'' ``v'0' ``v'_K' /// 461 | ``v'_LB' ``v'_MIN' ``v'_UB' ``v'_MAX' "`label'" 462 | else if "``v'discrete'"!="" { 463 | if `syntax'==1 _makebin_discrete ``v'' ``v'0' ``v'_K' /// 464 | ``v'_LB' ``v'_MIN' ``v'_UB' ``v'_MAX' ``v'_WD' 465 | else { 466 | rename ``v'0' ``v'' 467 | if "``v'discrete2'"!="" scalar ``v'_WD' = ``v'discrete2' 468 | } 469 | } 470 | else if "`hexagon'"=="" { 471 | if "``v'cuts'"!="" { 472 | _makebin_cuts `v' ``v'' ``v'0' ``v'_K' ``v'_LB' /// 473 | ``v'_MIN' ``v'_UB' ``v'_MAX' ``v'_WD' "``v'cuts'" 474 | local `v'WDvar ``v'_WD' 475 | } 476 | else { 477 | _makebin_continuous `v' ``v'' ``v'0' ``v'_K' ``v'_LB' /// 478 | ``v'_MIN' ``v'_UB' ``v'_MAX' ``v'_WD' "``v'tight'" /// 479 | "``v'clip'" "`swgt'" 480 | } 481 | } 482 | else /// hexagon 483 | _hexbin_prepare `v' ``v'' ``v'0' ``v'_K' ``v'_LB' /// 484 | ``v'_MIN' ``v'_UB' ``v'_MAX' ``v'_WD' "``v'tight'" /// 485 | `hexodd' "`swgt'" 486 | } 487 | } 488 | else { 489 | rename `x0' `x' 490 | rename `y0' `y' 491 | } 492 | if "`hexagon'"!="" { 493 | if "`xrecenter'"!="" clonevar `x0' = `x' 494 | if "`yrecenter'"!="" clonevar `y0' = `y' 495 | _hexbin `x' `x_LB' `x_UB' `x_WD' `x_MIN' `x_MAX' "`xtight'" "`xclip'" "`xdiscrete'" /// 496 | `y' `y_LB' `y_UB' `y_WD' `y_MIN' `y_MAX' "`ytight'" "`yclip'" "`ydiscrete'" /// 497 | `hexorder' 498 | } 499 | if `"`idgenerate'"'!="" { // undocumented 500 | gettoken IDY IDX : 
idgenerate 501 | gettoken IDX : IDX 502 | sort `y' 503 | qui gen `IDY' = sum(`y'!=`y'[_n-1]) if `y'<. 504 | sort `x' 505 | qui gen `IDX' = sum(`x'!=`x'[_n-1]) if `x'<. 506 | sort `ID' 507 | tempfile idgentmp 508 | qui save `idgentmp' 509 | restore 510 | capt drop `IDY' 511 | capt drop `IDX' 512 | qui merge 1:1 `ID' using `idgentmp', nogenerate 513 | preserve 514 | } 515 | 516 | // aggregate outcome and handle transformation and size variable 517 | if "`sizeprop'"!="" { 518 | tempvar z2 519 | if "`w'"!="" qui gen double `z2' = `w' 520 | else qui gen double `z2' = 1 521 | } 522 | if `"`statistic'"'!="asis" { 523 | if "`statistic'"==substr("proportion", 1, max(2, strlen("`statistic'"))) { 524 | local statistic0 proportion 525 | local statistic percent 526 | } 527 | else if "`statistic'"==substr("density", 1, max(4, strlen("`statistic'"))) { 528 | local statistic0 density 529 | local statistic percent 530 | } 531 | if "`z0'"=="" { 532 | tempvar z0 533 | qui gen double `z0' = 1 534 | } 535 | if "`sizeprop'"!="" local z2stat (percent) `z2' 536 | else if "`z2'"!="" local z2stat (`sizestat') `z2' 537 | if "`z3'"!="" { 538 | if `"`valuestat'"'=="" { 539 | if substr("`:type `z3''",1,3)=="str" local valuestat first 540 | else local valuestat mean 541 | } 542 | local z3stat (`valuestat') `z3' 543 | } 544 | if "`xrecenter'"!="" local xrcstat (mean) `x0' 545 | if "`yrecenter'"!="" local yrcstat (mean) `y0' 546 | if "`xWDvar'`yWDvar'"!="" { // xbcuts or ybcuts 547 | local WDstat (last) 548 | if "`xWDvar'"!="" local WDstat `WDstat' `x_WD' 549 | if "`yWDvar'"!="" local WDstat `WDstat' `y_WD' 550 | } 551 | if "`fast'"!="" local collapse gcollapse // requires gtools 552 | else local collapse collapse 553 | `collapse' (`statistic') `z0' `z2stat' `z3stat' `xrcstat' `yrcstat' /// 554 | `WDstat' `wgt', fast by(`by' `y' `x') 555 | if inlist("`statistic0'", "proportion", "density") { 556 | qui replace `z0' = `z0' / 100 557 | local statistic `statistic0' 558 | } 559 | } 560 | qui drop if 
`x'>=. | `y'>=. // no longer needed (missings were relevant 561 | // only for computing percentages by collapse) 562 | 563 | // compute areas in case of clip 564 | if "`normalize'"!="" | "`statistic'"=="density" { 565 | tempname AREA 566 | qui gen double `AREA' = `y_WD' * `x_WD' 567 | if "`yclip'`xclip'"!="" mata: cliparea("`hexagon'"!="") 568 | qui replace `z0' = `z0' / (`AREA') 569 | drop `AREA' 570 | } 571 | 572 | // z: rename (to prevent name conflict later on) 573 | tempname z 574 | rename `z0' `z' 575 | 576 | // apply transform 577 | if `"`transform'"'!="" { 578 | local transform: subinstr local transform "@" "`z'", all 579 | qui replace `z' = `transform' 580 | } 581 | 582 | // handle size 583 | if "`size'"!="" { 584 | if "`z2'"=="" { 585 | tempvar z2 586 | qui gen double `z2' = abs(`z') 587 | qui replace `z2' = 0 if `z2'>=. 588 | } 589 | else if "`sizeprop'"=="" { 590 | qui replace `z2' = abs(`z2') 591 | qui replace `z2' = 0 if `z2'>=. 592 | } 593 | } 594 | if "`z2'"!="" { 595 | if "`scatter'"=="" { 596 | gettoken size_min size_max : srange 597 | gettoken size_max : size_max 598 | su `z2', meanonly 599 | if "`size_min'"=="" { 600 | qui replace `z2' = sqrt(`z2'/r(max)) 601 | } 602 | else { 603 | if "`size_max'"=="" local size_max 1 604 | qui replace `z2' = sqrt(`z2'/r(max) * /// 605 | (`size_max'-`size_min') + `size_min') 606 | } 607 | } 608 | } 609 | 610 | // handle values 611 | if "`values'"!="" { 612 | if "`z3'"=="" { 613 | tempvar z3 614 | qui gen `z3' = `z' 615 | } 616 | if `"`valuestrans'"'!="" { 617 | // generate new variable so that result can be numeric or string 618 | tempvar tmp 619 | local valuestrans: subinstr local valuestrans "@" "`z3'", all 620 | qui gen `tmp' = (`valuestrans') 621 | local z3 "`tmp'" 622 | } 623 | if `"`valuesfmt'"'!="" format `valuesfmt' `z3' 624 | } 625 | 626 | // fillin 627 | _fillin "`fillin'" `x' `x_K' `x_LB' `x_WD' "`xtight'" "`xcat'`xdiscrete'" "`xcuts'" /// 628 | `y' `y_K' `y_LB' `y_WD' "`ytight'" "`ycat'`ydiscrete'" 
"`ycuts'" /// 629 | `z' "`z2'" "`by'" "`hexagon'" `hexorder' `hexodd' 630 | 631 | // determine range of data and set levels if cuts contains @min/@max 632 | su `z', meanonly 633 | scalar `MIN' = r(min) 634 | scalar `MAX' = r(max) 635 | if "`cuts'"!="" { 636 | _parse_cuts `CUTS' `MIN' `MAX' `"`cuts'"' 637 | } 638 | 639 | // if cuts() has been specified: add intervals at bottom and top, if needed 640 | capt confirm matrix `CUTS' 641 | if _rc==0 { 642 | if `MIN'<`CUTS'[1,1] { 643 | local ++levels 644 | matrix `CUTS' = J(1, `levels'+1, .) \ (`MIN', `CUTS') 645 | matrix `CUTS' = `CUTS'[2,1...] // so that colnames are correct 646 | } 647 | if `MAX'>`CUTS'[1,`levels'+1] { 648 | local ++levels 649 | matrix `CUTS' = `CUTS', `MAX' 650 | } 651 | } 652 | 653 | // get colors 654 | _parse comma colors coloropts : colors 655 | if `"`colors'"'=="" { 656 | if c(stata_version)<14.2 { 657 | if `"`coloropts'"'=="" local colors "hcl, viridis" 658 | else local colors `"hcl`coloropts'"' 659 | } 660 | else local colors `"viridis`coloropts'"' 661 | } 662 | else { 663 | local colors `"`colors'`coloropts'"' 664 | } 665 | if `levels'==0 { 666 | _colorpalette `levels' `colors' 667 | if `"`backfill'"'!="" local levels = r(n) - 1 668 | else local levels = r(n) 669 | } 670 | else { 671 | if `"`backfill'"'!="" _colorpalette `=`levels'+1' `colors' 672 | else _colorpalette `levels' `colors' 673 | } 674 | local colors `"`r(p)'"' 675 | if `"`backfill'"'!="" { 676 | if `"`backfill_last'"'=="" { 677 | gettoken backfill_color colors0 : colors, quotes 678 | gettoken colors colors0 : colors0, quotes 679 | while (1) { 680 | gettoken color colors0 : colors0, quotes 681 | if `"`color'"'=="" continue, break 682 | local colors `"`colors' `color'"' 683 | } 684 | } 685 | else { 686 | gettoken colors colors0 : colors, quotes 687 | while (1) { 688 | gettoken color colors0 : colors0, quotes 689 | if `"`colors0'"'=="" { // last color 690 | local backfill_color `"`color'"' 691 | continue, break 692 | } 693 | local 
colors `"`colors' `color'"' 694 | } 695 | } 696 | } 697 | if "`missing'"!="" & "`ramp'"=="" { 698 | local legend - " " 1 `missing_label' 699 | } 700 | 701 | // set cuts 702 | capt confirm matrix `CUTS' 703 | if _rc { 704 | matrix `CUTS' = J(1, `levels', .) 705 | matrix `CUTS'[1,1] = `MIN' 706 | forv i = 2/`levels' { 707 | matrix `CUTS'[1,`i'] = `MIN' + (`i'-1) * (`MAX' - `MIN') / `levels' 708 | } 709 | matrix `CUTS' = `CUTS', `MAX' 710 | } 711 | 712 | // prepare legend 713 | if "`keylabels'"=="" & "`ramp'"=="" { 714 | if `levels' <= 24 local keylabels "all" 715 | else { 716 | local keylabels = ceil(`levels'/24) 717 | numlist "`=1+`keylabels''(`keylabels')`=`levels'-`keylabels''" 718 | local keylabels "1 `r(numlist)' `levels'" 719 | } 720 | } 721 | 722 | // categorize outcome and collect legend keys 723 | tempvar Z 724 | qui gen byte `Z' = .z 725 | qui replace `Z' = 1 if `z'<. // (set first bin) 726 | local ul = `CUTS'[1,1] 727 | forv i = 1/`levels' { 728 | // categorize (skip first bin) 729 | if `i'>1 { 730 | qui replace `Z' = `Z' + (`z'>=`CUTS'[1,`i']) if `z'<. 
731 | } 732 | if "`ramp'"!="" continue 733 | // key labels 734 | local ll `ul' 735 | local ul = `CUTS'[1,`i'+1] 736 | if "`keylabels'"=="none" local keylab 737 | else if "`keylabels'"=="all" local keylab ok 738 | else if `:list posof "`i'" in keylabels' local keylab ok 739 | else local keylab 740 | if "`keylab'"!="" { 741 | if "`keylab_interval'`keylab_range'"=="" { 742 | local keylab = (`ll'+`ul')/2 743 | if `"`retransform'"'!="" { 744 | local keylab: subinstr local retransform "@" "`keylab'", all 745 | } 746 | local keylab `: di `keylab_format' `keylab'' 747 | } 748 | else { 749 | local ll0 `ll' 750 | local ul0 `ul' 751 | if `"`retransform'"'!="" { 752 | local ll0: subinstr local retransform "@" "`ll0'", all 753 | local ul0: subinstr local retransform "@" "`ul0'", all 754 | } 755 | local ll0 `:di `keylab_format' `ll0'' 756 | if "`keylab_interval'"!="" { 757 | local ul0 `:di `keylab_format' `ul0'' 758 | if `i'<`levels' local keylab "[`ll0', `ul0')" 759 | else local keylab "[`ll0', `ul0']" 760 | } 761 | else { 762 | if `i'<`levels' local ul0 = `ul0' - `keylab_range' 763 | local ul0 `:di `keylab_format' `ul0'' 764 | local keylab "`ll0'-`ul0'" 765 | } 766 | } 767 | } 768 | local ii = `i' + ("`missing'"!="") 769 | local legend `ii' "`keylab'" `legend' 770 | if "`keylabels'"=="minmax" { 771 | if `i'==1 { 772 | local keylab `:di `keylab_format' `ll'' 773 | local legend `legend' - "`keylab'" 774 | } 775 | else if `i'==`levels' { 776 | local keylab `:di `keylab_format' `ul'' 777 | local legend - "`keylab'" `legend' 778 | } 779 | } 780 | } 781 | 782 | // expand data 783 | tempvar X Y 784 | if "`scatter'"=="" { 785 | tempvar id 786 | qui gen `id' = _n 787 | if "`hexagon'"!="" { 788 | if "`yclip'`xclip'"!="" { 789 | qui expand 9 790 | sort `id' 791 | qui gen double `X' = . 792 | qui gen double `Y' = . 
793 | mata: fillinhexcoords() 794 | } 795 | else { 796 | qui expand 7 797 | sort `id' 798 | qui by `id': gen double `X' = cond(inlist(_n,5,6), -1, 1) * /// 799 | cond(inlist(_n,1,4), 0, `x_WD'/2) if _n<7 800 | qui by `id': gen double `Y' = cond(inlist(_n,1,2,6), -1, 1) / /// 801 | cond(inlist(_n,1,4), 1, 2) * `y_WD' * (2/3) if _n<7 802 | } 803 | } 804 | else { 805 | qui expand 5 806 | sort `id' 807 | qui by `id': gen double `X' = cond(inlist(_n,1,2), -`x_WD'/2, `x_WD'/2) if _n<5 808 | qui by `id': gen double `Y' = cond(inlist(_n,1,4), -`y_WD'/2, `y_WD'/2) if _n<5 809 | if "`yclip'`xclip'"!="" { 810 | if inlist("`yclip'", "clip", "lclip") { 811 | qui replace `Y' = max(`y'+`Y', `y_LB') - `y' if `Y'<. 812 | } 813 | if inlist("`yclip'", "clip", "rclip") { 814 | qui replace `Y' = min(`y'+`Y', `y_UB') - `y' if `Y'<. 815 | } 816 | if inlist("`xclip'", "clip", "lclip") { 817 | qui replace `X' = max(`x'+`X', `x_LB') - `x' if `X'<. 818 | } 819 | if inlist("`xclip'", "clip", "rclip") { 820 | qui replace `X' = min(`x'+`X', `x_UB') - `x' if `X'<. 821 | } 822 | } 823 | } 824 | if "`z2'"!="" { 825 | qui replace `X' = `X'*`z2' 826 | qui replace `Y' = `Y'*`z2' 827 | } 828 | } 829 | if "`xrecenter'"!="" { 830 | qui replace `x' = `x0' if `x0'<. 831 | drop `x0' 832 | } 833 | if "`yrecenter'"!="" { 834 | qui replace `y' = `y0' if `y0'<. 835 | drop `y0' 836 | } 837 | if "`scatter'"=="" { 838 | qui replace `X' = `x' + `X' 839 | qui replace `Y' = `y' + `Y' 840 | qui by `id': replace `x' = . if _n!=1 841 | qui by `id': replace `y' = . if _n!=1 842 | qui by `id': replace `z' = . if _n!=1 843 | drop `id' 844 | } 845 | else { 846 | qui gen `X' = . 847 | qui gen `Y' = . 
848 | } 849 | 850 | // addplot: get original data back in 851 | if `"`addplot'"'!="" & "`addplotnopreserve'"=="" { 852 | tempfile plotdata 853 | if "`by'"!="" { 854 | tempvar byindex sortindex 855 | sort `by' 856 | qui by `by': gen double `byindex' = _n 857 | qui save `"`plotdata'"', replace 858 | restore, preserve 859 | qui gen double `sortindex' = _n 860 | sort `by' 861 | qui by `by': gen double `byindex' = _n 862 | qui merge 1:1 `by' `byindex' using `"`plotdata'"', /// 863 | keep(match master using) nogenerate 864 | sort `sortindex' 865 | drop `byindex' `sortindex' 866 | } 867 | else if !`no_dataset_in_use' { 868 | qui save `"`plotdata'"', replace 869 | restore, preserve 870 | qui merge 1:1 _n using `"`plotdata'"', /// 871 | keep(match master using) nogenerate 872 | } 873 | capt erase `"`plotdata'"' 874 | } 875 | 876 | // swap x and y if hexdir==1 877 | if "`hexdir'"=="1" mata: swapxy() 878 | 879 | // compile plot 880 | local plots 881 | if "`scatter'"=="" | "`keylab_area'"!="" { 882 | // in case of scatter: include area plots to create legend 883 | if "`missing'"!="" { 884 | local plots (area `Y' `X' if `Z'==.z, `AXIS' nodropbase/* 885 | */ cmissing(n) color(black) finten(100) /*`lopts' */`missing2' ) 886 | } 887 | local clist 888 | forv i = 1/`levels' { 889 | gettoken c clist : clist 890 | if `"`c'"'=="" { // recycle 891 | gettoken c clist : colors 892 | } 893 | local plots `plots' (area `Y' `X' if `Z'==`i', `AXIS' nodropbase/* 894 | */ cmissing(n) color("`c'") finten(100) `p' `p`i'') 895 | } 896 | } 897 | if "`scatter'"!="" { 898 | if "`missing'"!="" { 899 | local plots `plots' (scatter `y' `x' if `Z'==.z, `AXIS'/* 900 | */ ms(X) color(black) `missing2') 901 | } 902 | if "`z2'"!="" { 903 | local zwgt [aw = `z2'] 904 | forv i = 1/`levels' { 905 | tempname y`i' 906 | qui gen `y`i'' = `y' if `Z'==`i' 907 | } 908 | } 909 | local clist 910 | local mslist 911 | forv i = 1/`levels' { 912 | gettoken c clist : clist 913 | if `"`c'"'=="" { // recycle 914 | gettoken c 
clist : colors 915 | } 916 | gettoken ms mslist : mslist 917 | if `"`ms'"'=="" { // recycle 918 | gettoken ms mslist : scatter2 919 | } 920 | if "`z2'"!="" local tmp `y`i'' `x' [aw = `z2'] 921 | else local tmp `y' `x' if `Z'==`i' 922 | local plots `plots' (scatter `tmp', `AXIS' ms(`ms') color("`c'")/* 923 | */ `p' `p`i'') 924 | } 925 | } 926 | if "`equations'"!="" { 927 | local plots `plots' (scatteri `eqcoords', `AXIS' recast(area)/* 928 | */ nodropbase cmissing(n) fcolor(none) lstyle(xyline)/* 929 | */ lalign(center) `equations2') 930 | } 931 | if "`values'"!="" { 932 | local plots `plots' (scatter `y' `x' if `z'<., `AXIS' mlabel(`z3')/* 933 | */ `values2') 934 | } 935 | if `"`addplot'"'!="" { 936 | local plots `plots' || `addplot' || 937 | } 938 | if "`ramp'"=="" { 939 | if `"`keylab_order'"'=="" local keylab_order order(`legend') 940 | if `"`keylab_pos'"'=="" & "`by'"=="" local keylab_pos position(3) 941 | local legendopt legend(subtitle(`ztitle', size(medsmall))/* 942 | */ `keylab_order' `keylab_pos' `keylab_opts') 943 | } 944 | if `syntax'==3 { 945 | local yscale yscale(reverse) 946 | local yhor ylabel(, angle(0)) 947 | } 948 | else if `syntax'==2 { 949 | local yscale yscale(reverse) 950 | local xscale xscale(alt) 951 | } 952 | else if "`ycat'"!="" { 953 | local yhor ylabel(, angle(0)) 954 | } 955 | if "`backfill'"!="" { 956 | if "`backfill_inner'"!="" { 957 | local backfill `"plotregion(icolor(`backfill_color'))"' 958 | } 959 | else { 960 | local backfill `"plotregion(color(`backfill_color') icolor(`backfill_color'))"' 961 | } 962 | } 963 | if "`graph'"=="" { 964 | if "`ramp'"!="" { 965 | tempname maingraph legendgraph 966 | local options `ramp_mopts' name(`maingraph') `options' 967 | } 968 | else { 969 | local options `legendopt' `options' 970 | } 971 | graph twoway `plots', ytitle(`"`ytitle'"') xtitle(`"`xtitle'"')/* 972 | */ `yscale' `yhor' `ylabel' `xscale' `xlabel' `backfill'/* 973 | */ `byopt' `options' 974 | if "`ramp'"!="" { 975 | // generate color 
ramp 976 | qui gen long `ramp_ID' = . 977 | qui gen double `ramp_Y' = . 978 | qui gen double `ramp_X' = . 979 | mata: generatescalecoords(st_matrix("`CUTS'"), /// 980 | ("`ramp_ID'", "`ramp_Y'", "`ramp_X'")) 981 | if `"`retransform'"'!="" { 982 | local tmptransform: subinstr local retransform "@" "`ramp_Y'", all 983 | qui replace `ramp_Y' = `tmptransform' if `ramp_Y'<. 984 | } 985 | local plots 986 | local clist 987 | forv i = 1/`levels' { 988 | gettoken c clist : clist 989 | if `"`c'"'=="" { // recycle 990 | gettoken c clist : colors 991 | } 992 | local plots `plots' (area `ramp_YX' if `ramp_ID'==`i',/* 993 | */ nodropbase cmissing(n) color("`c'") finten(100) `p' `p`i'') 994 | } 995 | local ymin = `CUTS'[1,1] 996 | local ymax = `CUTS'[1,`levels'+1] 997 | if `"`retransform'"'!="" { 998 | local tmptransform: subinstr local retransform "@" "`ymin'", all 999 | local ymin = `tmptransform' 1000 | local tmptransform: subinstr local retransform "@" "`ymax'", all 1001 | local ymax = `tmptransform' 1002 | } 1003 | if `"`ramp_ylabel'"'=="" local ramp_ylabel `ymin' `ymax' 1004 | else { 1005 | local ramp_ylabel: subinstr local ramp_ylabel "@min" "`ymin'", all 1006 | local ramp_ylabel: subinstr local ramp_ylabel "@max" "`ymax'", all 1007 | } 1008 | graph twoway `plots', name(`legendgraph') `ramp_opts' 1009 | if "`ramp_N_preserve'"!="" { 1010 | // remove observation added by generatescalecoords() 1011 | qui keep in 1/`ramp_N_preserve' 1012 | } 1013 | // put graphs together 1014 | if "`ramp_pos'"=="right" local grcombine `maingraph' `legendgraph', rows(1) 1015 | else if "`ramp_pos'"=="left" local grcombine `legendgraph' `maingraph', rows(1) 1016 | else if "`ramp_pos'"=="top" local grcombine `legendgraph' `maingraph', cols(1) 1017 | else /*bottom*/ local grcombine `maingraph' `legendgraph', cols(1) 1018 | graph combine `grcombine' iscale(1) commonscheme `ramp_combopts' `ramp_gropts' 1019 | } 1020 | } 1021 | 1022 | // generate 1023 | if "`generate'"!="" { 1024 | if 
`"`addplot'"'!="" & "`addplotnopreserve'"=="" { 1025 | if "`preserve'"!="" { // get rid of orig data 1026 | keep `z' `Z' `y' `Y' `x' `X' `z2' `z3' `xWDvar' `yWDvar' 1027 | keep if `Z'<. | `Z'==.z 1028 | } 1029 | else { 1030 | qui count 1031 | if (r(N)>`N0') { 1032 | di as txt "number of observations will be reset to " r(N) 1033 | di as txt "Press any key to continue, or Break to abort" 1034 | more 1035 | } 1036 | } 1037 | } 1038 | else if "`preserve'"=="" { 1039 | tempfile plotdata 1040 | if "`by'"!="" { 1041 | tempvar byindex sortindex 1042 | sort `by' 1043 | qui by `by': gen double `byindex' = _n 1044 | qui save `"`plotdata'"', replace 1045 | restore, preserve 1046 | qui gen double `sortindex' = _n 1047 | sort `by' 1048 | qui by `by': gen double `byindex' = _n 1049 | qui merge 1:1 `by' `byindex' using `"`plotdata'"', /// 1050 | keep(match master using) nogenerate 1051 | sort `sortindex' 1052 | drop `byindex' `sortindex' 1053 | } 1054 | else if !`no_dataset_in_use' { 1055 | qui save `"`plotdata'"', replace 1056 | restore, preserve 1057 | qui merge 1:1 _n using `"`plotdata'"', /// 1058 | keep(match master using) nogenerate 1059 | } 1060 | capt erase `"`plotdata'"' 1061 | qui count 1062 | if (r(N)>`N0') { 1063 | di as txt "number of observations will be reset to " r(N) 1064 | di as txt "Press any key to continue, or Break to abort" 1065 | more 1066 | } 1067 | } 1068 | lab var `z' "Z value" 1069 | lab var `Z' "Z id" 1070 | lab var `y' "Y value (midpoint)" 1071 | lab var `Y' "Y shape (coordinates)" 1072 | lab var `x' "X value (midpoint)" 1073 | lab var `X' "X shape (coordinates)" 1074 | if "`z2'"!="" { 1075 | lab var `z2' "shape scaling size" 1076 | } 1077 | if "`z3'"!="" { 1078 | lab var `z3' "marker labels" 1079 | } 1080 | foreach v0 in z Z y Y x X z2 z3 { 1081 | gettoken v vnames : vnames 1082 | if "``v0''"=="" continue 1083 | if "`replace'"!="" { 1084 | capt confirm new var `v', exact 1085 | if _rc drop `v' 1086 | } 1087 | rename ``v0'' `v' 1088 | local vdescribe 
`vdescribe' `v' 1089 | } 1090 | order `vdescribe', last 1091 | describe `vdescribe' 1092 | } 1093 | 1094 | // returns 1095 | foreach v in x y { 1096 | return scalar `v'_ub = ``v'_UB' 1097 | return scalar `v'_lb = ``v'_LB' 1098 | if "``v'cuts'"!="" local wd . 1099 | else local wd = ``v'_WD' 1100 | return scalar `v'_wd = `wd' 1101 | return scalar `v'_k = ``v'_K' 1102 | } 1103 | return local eqcoords `"`eqcoords'"' 1104 | return local keylabels `"`legend'"' 1105 | return local colors `"`colors'"' 1106 | return scalar levels = `levels' 1107 | return local xtitle `"`xtitle'"' 1108 | return local ytitle `"`ytitle'"' 1109 | return local ztitle `"`ztitle'"' 1110 | return matrix cuts = `CUTS' 1111 | return scalar N = `N' 1112 | 1113 | // skip restore if appropriate 1114 | if "`generate'"!="" restore, not 1115 | end 1116 | 1117 | program _parse_mata 1118 | syntax, mata(name) 1119 | c_local mata `mata' 1120 | end 1121 | 1122 | program _parse_bins 1123 | gettoken x 0 : 0 1124 | gettoken iswd 0 : 0 1125 | gettoken K 0 : 0 1126 | gettoken LB 0 : 0 1127 | gettoken UB 0 : 0 1128 | gettoken WD 0 : 0 1129 | _parse comma bins 0 : 0 1130 | syntax [, tight ltight rtight ] 1131 | if "`ltight'"!="" & "`rtight'"!="" local tight tight 1132 | else if "`tight'"=="" local tight `ltight' `rtight' 1133 | numlist `"`bins'"', min(0) max(4) missingokay 1134 | local bins `r(numlist)' 1135 | if `iswd' gettoken wd lb : bins 1136 | else gettoken k lb : bins 1137 | gettoken lb ub : lb 1138 | gettoken ub rest : ub 1139 | if `"`rest'"'!="" { 1140 | di as err `"`rest' not allowed"' 1141 | exit 198 1142 | } 1143 | if "`k'"=="" local k . 1144 | if "`k'"!="." { 1145 | capt numlist "`k'", integer max(1) range(>0) 1146 | if _rc { 1147 | di as err "`k' not allowed; number of bins must be a positive integer" 1148 | exit 198 1149 | } 1150 | } 1151 | if "`wd'"=="" local wd . 1152 | if "`wd'"!="." 
{ 1153 | if `wd'<=0 { 1154 | di as err "`wd' not allowed; bin width must be positive" 1155 | exit 198 1156 | } 1157 | } 1158 | if "`lb'"=="" local lb . 1159 | if "`ub'"=="" local ub . 1160 | scalar `K' = `k' 1161 | scalar `LB' = `lb' 1162 | scalar `UB' = `ub' 1163 | scalar `WD' = `wd' 1164 | c_local `x'tight `tight' 1165 | end 1166 | 1167 | program _parse_bcuts 1168 | args x K LB UB WD cuts 1169 | local n: list sizeof cuts 1170 | local lb: word 1 of `cuts' 1171 | local ub: word `n' of `cuts' 1172 | scalar `K' = `n' - 1 1173 | scalar `LB' = `lb' 1174 | scalar `UB' = `ub' 1175 | scalar `WD' = . 1176 | c_local `x'cuts "`cuts'" 1177 | end 1178 | 1179 | program _parse_hex 1180 | syntax [, VERTical HORizontal odd even left right ] 1181 | local dir `vertical' `horizontal' 1182 | if `: list sizeof dir'>1 { 1183 | di as err "only one of vertical and horizontal allowed" 1184 | exit 198 1185 | } 1186 | local order `right' `left' 1187 | if `: list sizeof order'>1 { 1188 | di as err "only one of left and right allowed" 1189 | exit 198 1190 | } 1191 | local odd `even' `odd' 1192 | if `: list sizeof odd'>1 { 1193 | di as err "only one of odd and even allowed" 1194 | exit 198 1195 | } 1196 | c_local hexdir = ("`dir'"=="horizontal") 1197 | c_local hexorder = ("`order'"=="left") 1198 | c_local hexodd = ("`odd'"=="odd") 1199 | end 1200 | 1201 | program _parse_bylegend 1202 | syntax [, legend(str asis) ramp ] 1203 | if "`ramp'"!="" { 1204 | if `"`legend'"'=="" local legend legend(off) 1205 | c_local bylegend `legend' 1206 | exit 1207 | } 1208 | local 0 `", `legend'"' 1209 | syntax [, POSition(passthru) * ] 1210 | if `"`position'"'=="" local position position(3) 1211 | c_local bylegend legend(`position' `options') 1212 | end 1213 | 1214 | program _parse_popts 1215 | local opts `0' 1216 | while (`"`opts'"'!="") { 1217 | gettoken opt opts : opts, bind 1218 | gettoken next : opts, bind 1219 | if substr(`"`next'"', 1, 1)=="(" { 1220 | gettoken next opts : opts, bind 1221 | local opt 
`opt'`next' 1222 | } 1223 | if substr(`"`opt'"', 1, 1)=="p" { 1224 | if regexm(`"`opt'"', "^p([0-9]+)") { 1225 | local num = regexs(1) 1226 | local 0 `", `opt'"' 1227 | capt syntax, P`num'(str) 1228 | if _rc==0 { 1229 | c_local p`num' `"`p`num''"' 1230 | continue 1231 | } 1232 | } 1233 | } 1234 | local graphopts `graphopts' `opt' 1235 | } 1236 | c_local graphopts `graphopts' 1237 | end 1238 | 1239 | program _check_gropts 1240 | // allow some additional options that do not seem to be covered by 1241 | // _get_gropts, gettwoway; this possibly has to be updated for 1242 | // future Stata versions 1243 | syntax [, /// 1244 | LEGend(passthru) /// 1245 | play(passthru) /// 1246 | PCYCle(passthru) /// 1247 | YVARLabel(passthru) /// 1248 | XVARLabel(passthru) /// 1249 | YVARFormat(passthru) /// 1250 | XVARFormat(passthru) /// 1251 | YOVERHANGs /// 1252 | XOVERHANGs /// 1253 | /// recast(passthru) /// 1254 | fxsize(passthru) /// 1255 | fysize(passthru) /// 1256 | ] 1257 | end 1258 | 1259 | program _parse_values 1260 | syntax [, Label(str asis) Format(str) TRANSform(str asis) /// 1261 | STYle(passthru) Position(passthru) Gap(passthru) ANGle(passthru) /// 1262 | Size(passthru) Color(passthru) /// 1263 | /// backward compatibility: 1264 | MLABSTYle(str) MLABPosition(str) MLABGap(str) MLABANGle(str) /// 1265 | MLABSize(str) MLABColor(str) MLABTextstyle(str) /// 1266 | ] 1267 | _parse_expstat `label' // returns exp and statistic 1268 | if `"`format'"'!="" { 1269 | confirm format `format' 1270 | } 1271 | foreach opt in style position gap angle size color textstyle { 1272 | // backward compatibility 1273 | if `"`mlab`opt''"'!="" { 1274 | if `"``opt''"'=="" { 1275 | local `opt' `opt'(`mlab`opt'') 1276 | } 1277 | } 1278 | } 1279 | if `"`position'"'=="" local position position(0) 1280 | if `"`style'`color'"'=="" local color color(black) 1281 | local values2 ms(i) mlab`position' 1282 | foreach opt in style gap angle size color textstyle { 1283 | if `"``opt''"'!="" { 1284 | local 
values2 `values2' mlab``opt'' 1285 | } 1286 | } 1287 | c_local valuesexp `"`exp'"' 1288 | c_local valuestat `"`statistic'"' 1289 | c_local valuestrans `"`transform'"' 1290 | c_local valuesfmt `format' 1291 | c_local values2 `values2' 1292 | end 1293 | 1294 | program _parse_size2 1295 | _parse_expstat `0' // returns exp and statistic 1296 | if `"`statistic'"'=="" local statistic mean 1297 | c_local size2 `"`exp'"' 1298 | c_local sizestat `"`statistic'"' 1299 | end 1300 | 1301 | program _parse_expstat 1302 | _parse comma EXP 0 : 0 1303 | syntax [, Statistic(name) ] 1304 | c_local exp `"`EXP'"' 1305 | c_local statistic "`statistic'" 1306 | end 1307 | 1308 | program _parse_backfill 1309 | syntax [, Inner Last ] 1310 | c_local backfill_inner `inner' 1311 | c_local backfill_last `last' 1312 | end 1313 | 1314 | program _parse_missing 1315 | syntax [, Label(str asis) * ] 1316 | if `"`label'"'=="" local label `""missing""' 1317 | else { 1318 | gettoken trash: label, qed(qed) 1319 | if `qed'==0 { 1320 | // if first token unquoted: pack complete string into quotes 1321 | local label `"`"`label'"'"' 1322 | } 1323 | else { 1324 | // if first token quoted: pack all tokens into quotes 1325 | mata: st_local("label", quotetokens(st_local("label"))) 1326 | } 1327 | } 1328 | c_local missing_label `"`label'"' 1329 | c_local missing2 `"`options'"' 1330 | end 1331 | 1332 | program _check_cuts 1333 | _parse comma CUTS 0 : 0 1334 | if strpos(`"`0'"', "@min") exit // do parsing later, when min is known 1335 | if strpos(`"`0'"', "@max") exit // do parsing later, when max is known 1336 | if strpos(`"`0'"', "{") __parse_cuts `0' 1337 | else syntax [, cuts(numlist ascending) ] 1338 | local levels: list sizeof cuts 1339 | matrix `CUTS' = J(1,`levels',.) 
1340 | forv i = 1/`levels' { 1341 | gettoken cut cuts : cuts 1342 | matrix `CUTS'[1,`i'] = `cut' 1343 | } 1344 | c_local cuts 1345 | c_local levels = `levels' - 1 1346 | end 1347 | 1348 | program _parse_cuts 1349 | args CUTS MIN MAX cuts 1350 | local min = `MIN' 1351 | local max = `MAX' 1352 | local cuts: subinstr local cuts "@min" "`min'", all 1353 | local cuts: subinstr local cuts "@max" "`max'", all 1354 | __parse_cuts, cuts(`cuts') 1355 | local levels: list sizeof cuts 1356 | matrix `CUTS' = J(1,`levels',.) 1357 | forv i = 1/`levels' { 1358 | gettoken cut cuts : cuts 1359 | if `cut'==`min' matrix `CUTS'[1,`i'] = `MIN' // preserve precision 1360 | else if `cut'==`max' matrix `CUTS'[1,`i'] = `MAX' // preserve precision 1361 | else matrix `CUTS'[1,`i'] = `cut' 1362 | } 1363 | c_local cuts 1364 | c_local levels = `levels' - 1 1365 | end 1366 | 1367 | program __parse_cuts 1368 | syntax [, cuts(str asis) ] 1369 | if strpos(`"`cuts'"', "{") { 1370 | local rest `"`cuts'"' 1371 | local cuts 1372 | while (`"`rest'"'!="") { 1373 | gettoken t rest : rest, parse("{") 1374 | if `"`t'"'=="{" { 1375 | gettoken t rest : rest, parse("}") 1376 | if `"`rest'"'=="" { // closing brace not found 1377 | local cuts `"`0'"' 1378 | continue, break 1379 | } 1380 | local t = `t' 1381 | local cuts `"`cuts'`t'"' 1382 | gettoken t rest : rest, parse("}") 1383 | continue 1384 | } 1385 | local cuts `"`cuts'`t'"' 1386 | } 1387 | } 1388 | local 0 `", cuts(`cuts')"' 1389 | syntax [, cuts(numlist ascending) ] 1390 | c_local cuts `cuts' 1391 | end 1392 | 1393 | program _parse_keylab 1394 | _parse comma keylab 0 : 0 1395 | syntax [, /// 1396 | Format(str) TRANSform(str asis) INTERval RANge(numlist max=1) area /// 1397 | all order(passthru) POSition(passthru) Cols(passthru) Rows(passthru) /// 1398 | ROWGap(passthru) KEYGap(passthru) SYMXsize(passthru) /// 1399 | TSTYle(passthru) SIze(passthru) * ] 1400 | local cols `rows' `cols' 1401 | local size `tstyle' `size' 1402 | if "`interval'"!="" & 
"`range'"!="" { 1403 | di as err "interval and range() are not both allowed" 1404 | exit 198 1405 | } 1406 | if `"`keylab'"'=="all" c_local keylabels "all" 1407 | else if `"`keylab'"'=="none" c_local keylabels "none" 1408 | else if `"`keylab'"'=="minmax" c_local keylabels "minmax" 1409 | else if `"`keylab'"'!="" { 1410 | capt n numlist `"`keylab'"', ascending integer range(>0) 1411 | if _rc { 1412 | di as err `"`keylab' not allowed in keylabels()"' 1413 | exit _rc 1414 | } 1415 | c_local keylabels "`r(numlist)'" 1416 | } 1417 | else c_local keylabels 1418 | if `"`cols'"'=="" local cols cols(1) 1419 | if `"`rowgap'"'=="" local rowgap rowgap(0) 1420 | if `"`keygap'"'=="" local keygap keygap(tiny) 1421 | if `"`symxsize'"'=="" local symxsize symxsize(medlarge) 1422 | if `"`size'"'=="" local size size(vsmall) 1423 | if `"`format'"'!="" { 1424 | confirm numeric format `format' 1425 | } 1426 | else local format %7.0g 1427 | c_local keylab_format `"`format'"' 1428 | c_local keylab_interval `interval' 1429 | c_local keylab_range `range' 1430 | c_local keylab_area `area' 1431 | c_local keylab_order `order' 1432 | c_local keylab_pos `position' 1433 | c_local keylab_opts all `cols' `rowgap' `keygap' `symxsize' `size' `options' 1434 | c_local retransform `"`transform'"' 1435 | end 1436 | 1437 | program _parse_ramp 1438 | syntax anything [, Left Right Top Bottom /// 1439 | LABels(str asis) /// 1440 | Format(passthru) /// 1441 | Length(numlist max=1 >=0 <=100) /// 1442 | Space(numlist max=1 >=0 <=100) /// 1443 | TRANSform(str asis) /// 1444 | Combine(str asis) /// 1445 | fysize(passthru) fxsize(passthru) * ] 1446 | gettoken y anything : anything 1447 | gettoken x : anything 1448 | local pos `left' `right' `top' `bottom' 1449 | if `:list sizeof pos'>1 { 1450 | di as err "ramp(): only one of left, right, top, and bottom allowed" 1451 | exit 198 1452 | } 1453 | if "`pos'"=="" local pos bottom 1454 | if "`pos'"=="right" { 1455 | local Y y 1456 | local X x 1457 | local alt "alt " 1458 
| local angle "angle(0) " 1459 | local margin "l=0 t=0 b=0" 1460 | local tiopt "span position(11) margin(l=0 r=0 t=0 b=2)" 1461 | if "`length'"=="" local length 60 1462 | if "`space'"=="" local space 20 1463 | } 1464 | else if "`pos'"=="left" { 1465 | local Y y 1466 | local X x 1467 | local angle "angle(0) " 1468 | local margin "r=0 t=0 b=0" 1469 | local tiopt "span position(12) margin(l=0 r=0 t=0 b=2)" 1470 | if "`length'"=="" local length 60 1471 | if "`space'"=="" local space 20 1472 | } 1473 | else if "`pos'"=="top" { 1474 | local Y x 1475 | local X y 1476 | local margin "l=0 r=0 b=0" 1477 | local tiopt "position(9) margin(l=0 r=1 t=0 b=0)" 1478 | if "`length'"=="" local length 80 1479 | if "`space'"=="" local space 12 1480 | } 1481 | else { 1482 | local Y x 1483 | local X y 1484 | local margin "l=0 r=0 t=0" 1485 | local tiopt "position(9) margin(l=0 r=1 t=0 b=0)" 1486 | if "`length'"=="" local length 80 1487 | if "`space'"=="" local space 12 1488 | } 1489 | if "`f`Y'size'"=="" local f`Y'size `length' 1490 | if "`f`X'size'"=="" local f`X'size `space' 1491 | _parse comma lhs rhs : labels 1492 | if `"`rhs'"'!="" { 1493 | gettoken comma rhs : rhs, parse(",") 1494 | local rhs `" `rhs'"' 1495 | } 1496 | if `"`format'"'=="" local format format(%7.0g) 1497 | c_local ramp_YX ``Y'' ``X'' 1498 | c_local ramp_pos `pos' 1499 | c_local ramp_ylabel `"`lhs'"' 1500 | c_local ramp_opts `X'scale(off) `X'label(, nogrid)/* 1501 | */ `Y'title("") `Y'scale(`alt'noextend)/* 1502 | */ `Y'label(\`ramp_ylabel', `format' `angle'nogrid`rhs')/* 1503 | */ subtitle(\`ztitle', size(medsmall) `tiopt')/* 1504 | */ legend(off) f`Y'size(`f`Y'size') f`X'size(`f`X'size')/* 1505 | */ plotregion(margin(zero) style(none)) graphr(margin(zero))/* 1506 | */ nodraw `options' 1507 | c_local ramp_mopts legend(off) nodraw graphregion(margin(`margin')) 1508 | c_local ramp_combopts `"`combine'"' 1509 | c_local retransform `"`transform'"' 1510 | end 1511 | 1512 | program _parse_ramp_gropts 1513 | syntax [, /// 
1514 | TItle(passthru) SUBtitle(passthru) note(passthru) CAPtion(passthru) /// 1515 | YSIZe(passthru) XSIZe(passthru) nodraw scheme(passthru) /// 1516 | name(passthru) saving(passthru) * ] 1517 | c_local ramp_gropts `title' `subtitle' `note' `caption' `ysize' `xsize' `draw' `scheme' `name' `saving' 1518 | c_local options `options' 1519 | end 1520 | 1521 | program _parse_generate 1522 | syntax [namelist] [, noPReserve ] 1523 | c_local generate2 `namelist' 1524 | c_local nopreserve `preserve' 1525 | end 1526 | 1527 | program _makebin_categorical 1528 | gettoken fast 0 : 0 1529 | gettoken xy 0 : 0 1530 | if "`fast'"=="" _makebin_categorical_std `0' 1531 | else _makebin_categorical_fast `0' 1532 | c_local `xy'label `xy'label(`lbls') 1533 | end 1534 | 1535 | program _makebin_categorical_std, sortpreserve 1536 | args v x K LB MIN UB MAX label 1537 | sort `x' 1538 | qui by `x': gen double `v' = (`x'!=`x'[_n-1]) 1539 | mata: collectlbls("`x'", "`v'") // returns local lbls and sets scalar K 1540 | qui replace `v' = `v' + `v'[_n-1] in 2/l 1541 | scalar `LB' = 1 1542 | scalar `MIN' = 1 1543 | scalar `UB' = `K' 1544 | scalar `MAX' = `K' 1545 | c_local lbls `"`lbls'"' 1546 | end 1547 | 1548 | program _makebin_categorical_fast // requires gtools 1549 | args v x K LB MIN UB MAX label 1550 | tempvar tag 1551 | gegen long `v' = group(`x') 1552 | gegen byte `tag' = tag(`v') 1553 | mata: collectlbls("`x'", "`tag'", "`v'") // returns local lbls and sets scalar K 1554 | scalar `LB' = 1 1555 | scalar `MIN' = 1 1556 | scalar `UB' = `K' 1557 | scalar `MAX' = `K' 1558 | c_local lbls `"`lbls'"' 1559 | end 1560 | 1561 | program _makebin_discrete 1562 | args v x K LB MIN UB MAX WD 1563 | rename `x' `v' 1564 | su `v', meanonly 1565 | scalar `MIN' = r(min) 1566 | scalar `MAX' = r(max) 1567 | scalar `K' = round((`MAX'-`MIN')/`WD') + 1 1568 | scalar `LB' = `MIN' 1569 | scalar `UB' = `MAX' 1570 | end 1571 | 1572 | program _makebin_cuts 1573 | args xy v x K LB MIN UB MAX WD cuts 1574 | // setup
1575 | su `x', meanonly 1576 | scalar `MIN' = r(min) 1577 | scalar `MAX' = r(max) 1578 | // compute bin midpoints 1579 | qui gen double `WD' = . 1580 | qui gen double `v' = . 1581 | gettoken ul cuts : cuts 1582 | forv i = 1/`=`K'' { 1583 | local ll `ul' 1584 | gettoken ul cuts : cuts 1585 | local wd = `ul'-`ll' 1586 | if `i'==`K' local lt "<=" 1587 | else local lt "<" 1588 | qui replace `WD' = `ul'-`ll' if `x'>=`ll' & `x'`lt'`ul' 1589 | qui replace `v' = (`ul'+`ll')/2 if `x'>=`ll' & `x'`lt'`ul' 1590 | } 1591 | // note on omitted data 1592 | if `LB'>`MIN' | `UB'<`MAX' { 1593 | qui count if `v'>=. 1594 | if r(N) { 1595 | di as txt "(`r(N)' observations outside range of `xy'cuts)" 1596 | } 1597 | } 1598 | end 1599 | 1600 | program _makebin_continuous 1601 | args xy v x K LB MIN UB MAX WD tight clip wgt 1602 | // setup 1603 | su `x' `wgt', meanonly 1604 | local N = r(N) 1605 | scalar `MIN' = r(min) 1606 | scalar `MAX' = r(max) 1607 | if `LB'>=. scalar `LB' = `MIN' 1608 | if `UB'>=. scalar `UB' = `MAX' 1609 | // determine step width 1610 | local UBtight = 0 1611 | if `WD'>=. { 1612 | if `K'>=. 
{ 1613 | scalar `K' = max(1, trunc(min(sqrt(`N'), 10*ln(`N')/ln(10)))) 1614 | if (`UB'-`LB')<(`MAX'-`MIN') { 1615 | // reduce k if range has been restricted 1616 | scalar `K' = ceil(`K' * (`UB'-`LB') / (`MAX'-`MIN')) 1617 | } 1618 | } 1619 | if `K'<2 { 1620 | if `UB'>`LB' & "`tight'"=="" { 1621 | scalar `K' = 2 // set minimum 1622 | di as txt "(number of `xy'bins reset to 2)" 1623 | } 1624 | } 1625 | if `UB'<=`LB' scalar `WD' = 1 1626 | else if "`tight'"=="tight" scalar `WD' = (`UB' - `LB') / `K' 1627 | else if "`tight'"=="ltight" scalar `WD' = (`UB' - `LB') / (`K'-.5) 1628 | else if "`tight'"=="rtight" scalar `WD' = (`UB' - `LB') / (`K'-.5) 1629 | else scalar `WD' = (`UB' - `LB') / (`K'-1) 1630 | if inlist("`tight'", "tight", "rtight") & `UB'>`LB' local UBtight = 1 1631 | } 1632 | else { 1633 | local lb st_numscalar("`LB'") 1634 | local ub st_numscalar("`UB'") 1635 | local wd st_numscalar("`WD'") 1636 | if "`tight'"=="tight" mata: countbins(`lb'+`wd'/2, `ub', `wd', `wd'/2, 1) 1637 | else if "`tight'"=="ltight" mata: countbins(`lb'+`wd'/2, `ub', `wd', `wd'/2, 0) 1638 | else if "`tight'"=="rtight" mata: countbins(`lb', `ub', `wd', `wd'/2, 1) 1639 | else mata: countbins(`lb', `ub', `wd', `wd'/2, 0) 1640 | if inlist("`tight'", "tight", "rtight") { 1641 | if "`tight'"=="tight" local UBtight = (abs((`LB' + `K'*`WD') - `UB') / (`WD'+1)) < 1e-12 1642 | else local UBtight = (abs((`LB' + (`K'-.5)*`WD') - `UB') / (`WD'+1)) < 1e-12 1643 | } 1644 | } 1645 | // compute bin midpoints 1646 | qui gen double `v' = floor((`x' - `LB') / `WD' * 2) 1647 | if inlist("`tight'", "tight", "ltight") { 1648 | if "`tight'"=="tight" & `UBtight' /// 1649 | qui replace `v' = floor((`v' - (mod(`v', 2)==0 & `x'==`UB'))/2) * 2 + 1 1650 | else qui replace `v' = floor(`v'/2) * 2 + 1 1651 | } 1652 | else { 1653 | if "`tight'"=="rtight" & `UBtight' /// 1654 | qui replace `v' = floor((`v'+1 - (mod(`v'+1, 2)==0 & `x'==`UB'))/2) * 2 1655 | else qui replace `v' = floor((`v'+1)/2) * 2 1656 | } 1657 | qui 
replace `v' = `LB' + `v'/2 * `WD' 1658 | // remove bins that are out of range 1659 | if `LB'>`MIN' qui replace `v' = . if `v' < `LB' 1660 | if `UB'<`MAX' { 1661 | if inlist("`tight'", "tight", "rtight") /// 1662 | qui replace `v' = . if `v' >= (`UB' + `WD'/2) 1663 | else /// 1664 | qui replace `v' = . if `v' > (`UB' + `WD'/2) 1665 | } 1666 | // clip: omit data that is out of range 1667 | if "`clip'"!="" { 1668 | if "`clip'"!="rclip" & `LB'>`MIN' qui replace `v' = . if `x' < `LB' 1669 | if "`clip'"!="lclip" & `UB'<`MAX' qui replace `v' = . if `x' > `UB' 1670 | } 1671 | // note on omitted data 1672 | if `LB'>`MIN' | `UB'<`MAX' { 1673 | qui count if `v'>=. 1674 | if r(N) { 1675 | di as txt "(`r(N)' observations outside range of `xy'bins)" 1676 | } 1677 | } 1678 | end 1679 | 1680 | program _hexbin_prepare 1681 | args xy v x K LB MIN UB MAX WD tight odd wgt 1682 | // setup 1683 | su `x' `wgt', meanonly 1684 | local N = r(N) 1685 | scalar `MIN' = r(min) 1686 | scalar `MAX' = r(max) 1687 | if `LB'>=. scalar `LB' = `MIN' 1688 | if `UB'>=. scalar `UB' = `MAX' 1689 | // determine step width 1690 | if `WD'>=. { 1691 | if `K'>=. 
{ 1692 | scalar `K' = max(1, trunc(min(sqrt(`N'), 10*ln(`N')/ln(10)))) 1693 | if (`UB'-`LB')<(`MAX'-`MIN') { 1694 | // reduce k if range has been restricted 1695 | if "`xy'"=="y" scalar `K' = ceil(`K' * (`UB'-`LB') / (`MAX'-`MIN')) 1696 | else scalar `K' = ceil(2 * `K' * (`UB'-`LB') / (`MAX'-`MIN')) / 2 1697 | } 1698 | } 1699 | if `K'<2 { 1700 | if `UB'>`LB' & "`tight'"=="" { 1701 | scalar `K' = 2 // set minimum 1702 | di as txt "(number of `xy'bins reset to 2)" 1703 | } 1704 | } 1705 | if `UB'<=`LB' scalar `WD' = 1 1706 | else if "`xy'"=="y" { 1707 | if "`tight'"=="tight" scalar `WD' = (`UB' - `LB') / (`K'-1/3) 1708 | else if "`tight'"=="ltight" scalar `WD' = (`UB' - `LB') / (`K'-2/3) 1709 | else if "`tight'"=="rtight" scalar `WD' = (`UB' - `LB') / (`K'-2/3) 1710 | else scalar `WD' = (`UB' - `LB') / (`K'-1) 1711 | } 1712 | else { 1713 | if "`tight'"=="tight" scalar `WD' = (`UB' - `LB') / (`K'-.5 -`odd'*.5) 1714 | else if "`tight'"=="ltight" scalar `WD' = (`UB' - `LB') / (`K'-3/4-`odd'*.5) 1715 | else if "`tight'"=="rtight" scalar `WD' = (`UB' - `LB') / (`K'-3/4-`odd'*.5) 1716 | else scalar `WD' = (`UB' - `LB') / (`K'-1 -`odd'*.5) 1717 | } 1718 | } 1719 | else { 1720 | local lb st_numscalar("`LB'") 1721 | local ub st_numscalar("`UB'") 1722 | local wd st_numscalar("`WD'") 1723 | if "`xy'"=="y" { 1724 | if "`tight'"=="tight" mata: countbins(`lb' , `ub', `wd', `wd'/2, 1) 1725 | else if "`tight'"=="ltight" mata: countbins(`lb' , `ub', `wd', `wd'/2, 0) 1726 | else if "`tight'"=="rtight" mata: countbins(`lb'-`wd'/4, `ub', `wd', `wd'/2, 1) 1727 | else mata: countbins(`lb'-`wd'/4, `ub', `wd', `wd'/2, 0) 1728 | } 1729 | else { 1730 | if "`tight'"=="tight" mata: countbins(`lb'+`wd'/3, `ub', `wd', `wd'*2/3, 1, `odd') 1731 | else if "`tight'"=="ltight" mata: countbins(`lb'+`wd'/3, `ub', `wd', `wd'*2/3, 0, `odd') 1732 | else if "`tight'"=="rtight" mata: countbins(`lb' , `ub', `wd', `wd'*2/3, 1, `odd') 1733 | else mata: countbins(`lb' , `ub', `wd', `wd'*2/3, 0, `odd') 1734 | } 
1735 | } 1736 | rename `x' `v' 1737 | end 1738 | 1739 | program _hexbin 1740 | args x x_LB x_UB x_WD x_MIN x_MAX xtight xclip xdisc /// 1741 | y y_LB y_UB y_WD y_MIN y_MAX ytight yclip ydisc order 1742 | // y 1743 | if "`ydisc'"!="" { 1744 | local y1 `y' 1745 | local y2 `y' 1746 | } 1747 | else { 1748 | tempvar y1 y2 1749 | qui gen double `y1' = floor((`y' - `y_LB') / `y_WD' * 3) 1750 | if inlist("`ytight'", "tight", "ltight") { 1751 | qui gen double `y2' = floor((`y1'-2)/6) * 6 + 4 1752 | qui replace `y1' = floor((`y1'+1)/6) * 6 + 1 1753 | } 1754 | else { 1755 | qui gen double `y2' = floor((`y1'-1)/6) * 6 + 3 1756 | qui replace `y1' = floor((`y1'+2)/6) * 6 1757 | } 1758 | if inlist("`ytight'", "tight", "ltight") { 1759 | // make sure that obs on lower edge are not put in bin below 1760 | qui replace `y2' = `y2' + 6 if `y2'<0 & `y'==`y_LB' 1761 | } 1762 | if inlist("`ytight'", "tight", "rtight") { 1763 | // make sure that obs on upper edge are not put in bin above 1764 | qui replace `y1' = `y1' - 6 if `y'==`y_UB' & /// 1765 | (abs((`y_LB' + ((`y1'-2)/3) * `y_WD') - `y_UB') / (`y_WD'+1)) < 1e-12 1766 | qui replace `y2' = `y2' - 6 if `y'==`y_UB' & /// 1767 | (abs((`y_LB' + ((`y2'-2)/3) * `y_WD') - `y_UB') / (`y_WD'+1)) < 1e-12 1768 | } 1769 | qui replace `y1' = `y_LB' + `y1'/3 * `y_WD' 1770 | qui replace `y2' = `y_LB' + `y2'/3 * `y_WD' 1771 | } 1772 | // x 1773 | if "`xdisc'"!="" { 1774 | local x1 `x' 1775 | local x2 `x' 1776 | } 1777 | else { 1778 | tempvar x1 x2 1779 | qui gen double `x1' = floor((`x' - `x_LB') / `x_WD' * 4) 1780 | if inlist("`xtight'", "tight", "ltight") { 1781 | qui gen double `x2' = floor((`x1'+2)/4) * 4 1782 | qui replace `x1' = floor(`x1'/4) * 4 + 2 1783 | } 1784 | else { 1785 | qui gen double `x2' = floor((`x1'+3)/4) * 4 - 1 1786 | qui replace `x1' = floor((`x1'+1)/4) * 4 + 1 1787 | } 1788 | if inlist("`xtight'", "tight", "rtight") { 1789 | // make sure that obs on right edge are not put in bin on the right 1790 | qui replace `x1' = `x1' - 4 
if `x'==`x_UB' & /// 1791 | (abs((`x_LB' + ((`x1'-2)/4) * `x_WD') - `x_UB') / (`x_WD'+1)) < 1e-12 1792 | qui replace `x2' = `x2' - 4 if `x'==`x_UB' & /// 1793 | (abs((`x_LB' + ((`x2'-2)/4) * `x_WD') - `x_UB') / (`x_WD'+1)) < 1e-12 1794 | } 1795 | if `order' { 1796 | local tmp `x1' 1797 | local x1 `x2' 1798 | local x2 `tmp' 1799 | } 1800 | qui replace `x1' = `x_LB' + `x1'/4 * `x_WD' 1801 | qui replace `x2' = `x_LB' + `x2'/4 * `x_WD' 1802 | } 1803 | // pick position 1804 | if "`xdisc'"=="" | "`ydisc'"=="" { 1805 | tempvar d 1806 | qui gen byte `d' = (((`x'-`x1')/`x_WD')^2 + 3/4*((`y'-`y1')/`y_WD')^2) /// 1807 | < (((`x'-`x2')/`x_WD')^2 + 3/4*((`y'-`y2')/`y_WD')^2) 1808 | qui replace `x1' = `x2' if `d'==0 1809 | qui replace `y1' = `y2' if `d'==0 1810 | } 1811 | // remove bins that are out of range 1812 | if `y_LB'>`y_MIN' { 1813 | if inlist("`ytight'", "tight", "rtight") /// 1814 | qui replace `y1' = . if `y1' < `y_LB' 1815 | else /// 1816 | qui replace `y1' = . if (`y1'+ 2*`y_WD'/3) <= `y_LB' 1817 | } 1818 | if `y_UB'<`y_MAX' { 1819 | if inlist("`ytight'", "tight", "rtight") /// 1820 | qui replace `y1' = . if `y1' >= (`y_UB' + 2*`y_WD'/3) 1821 | else /// 1822 | qui replace `y1' = . if `y1' > (`y_UB' + 2*`y_WD'/3) 1823 | } 1824 | if `x_LB'>`x_MIN' { 1825 | if inlist("`xtight'", "tight", "ltight") /// 1826 | qui replace `x1' = . if `x1' < `x_LB' 1827 | else /// 1828 | qui replace `x1' = . if (`x1'+ `x_WD'/2) <= `x_LB' 1829 | } 1830 | if `x_UB'<`x_MAX' { 1831 | if inlist("`xtight'", "tight", "rtight") /// 1832 | qui replace `x1' = . if `x1' >= (`x_UB' + `x_WD'/2) 1833 | else /// 1834 | qui replace `x1' = . if `x1' > (`x_UB' + `x_WD'/2) 1835 | } 1836 | // clip: omit data that is out of range 1837 | if "`yclip'"!="" { 1838 | if "`yclip'"!="rclip" & `y_LB'>`y_MIN' qui replace `y1' = . if `y' < `y_LB' 1839 | if "`yclip'"!="lclip" & `y_UB'<`y_MAX' qui replace `y1' = .
if `y' > `y_UB' 1840 | } 1841 | if "`xclip'"!="" { 1842 | if "`xclip'"!="rclip" & `x_LB'>`x_MIN' qui replace `x1' = . if `x' < `x_LB' 1843 | if "`xclip'"!="lclip" & `x_UB'<`x_MAX' qui replace `x1' = . if `x' > `x_UB' 1844 | } 1845 | if "`xdisc'"=="" { 1846 | drop `x' 1847 | rename `x1' `x' 1848 | } 1849 | if "`ydisc'"=="" { 1850 | drop `y' 1851 | rename `y1' `y' 1852 | } 1853 | // note on omitted data 1854 | if `x_LB'>`x_MIN' | `x_UB'<`x_MAX' { 1855 | qui count if `x'>=. 1856 | if r(N) { 1857 | di as txt "(`r(N)' observations outside range of xbins)" 1858 | } 1859 | } 1860 | if `y_LB'>`y_MIN' | `y_UB'<`y_MAX' { 1861 | qui count if `y'>=. 1862 | if r(N) { 1863 | di as txt "(`r(N)' observations outside range of ybins)" 1864 | } 1865 | } 1866 | end 1867 | 1868 | program _fillin 1869 | args fillin x x_K x_LB x_WD xtight xcat xcuts y y_K y_LB y_WD ytight ycat ycuts /// 1870 | z z2 by hexagon hexorder hexodd 1871 | if "`fillin'"=="" exit 1872 | tempname byindex 1873 | if "`by'"!="" { 1874 | sort `by' 1875 | qui by `by': gen double `byindex' = (_n==1) 1876 | qui replace `byindex' = `byindex'[_n-1] + `byindex' in 2/l 1877 | } 1878 | else qui gen byte `byindex' = 1 1879 | if "`hexagon'"!="" { 1880 | tempname xold 1881 | rename `x' `xold' 1882 | if inlist("`xtight'", "tight", "ltight") local xoff .25 1883 | else local xoff 0 1884 | qui gen double `x' = `x_LB' + /// 1885 | (round((`xold' - `x_LB') / `x_WD' - `xoff') + `xoff') * `x_WD' 1886 | } 1887 | tempfile plotdata 1888 | qui save `"`plotdata'"', replace 1889 | local keepvars `byindex' `x' `y' 1890 | if "`xcuts'"!="" local keepvars `keepvars' `x_WD' 1891 | if "`ycuts'"!="" local keepvars `keepvars' `y_WD' 1892 | keep `keepvars' 1893 | mata: fillingaps() 1894 | tempvar merge 1895 | qui merge 1:1 `byindex' `x' `y' using `"`plotdata'"', /// 1896 | keep(match master using) generate(`merge') 1897 | // capt assert (`merge'==1 | `merge'==3) 1898 | // if _rc { 1899 | // di as err "unexpected error; fillin algorithm returned 
inconsistent results" 1900 | // exit 499 1901 | // } 1902 | if "`hexagon'"!="" { 1903 | if inlist("`ytight'", "tight", "ltight") { 1904 | if `hexorder'==0 local yoff 1/3 1905 | else local yoff 4/3 1906 | } 1907 | else if `hexorder'==0 local yoff 0 1908 | else local yoff 1 1909 | qui replace `x' = `x' + (`x_WD'/4) * /// 1910 | cond(mod(round((`y' - `y_LB') / `y_WD' - `yoff'), 2), -1, 1) /// 1911 | if `xold'>=. 1912 | if `hexodd' { // remove last half-column 1913 | qui replace `x' = . if `xold'>=. /// 1914 | & ((`x' - `x_LB') / `x_WD' - `xoff') > (`x_K'-1) 1915 | qui keep if `x'<. 1916 | } 1917 | qui replace `xold' = `x' if `xold'>=. 1918 | drop `x' 1919 | rename `xold' `x' 1920 | } 1921 | qui replace `z' = `:word 1 of `fillin'' if `merge'==1 1922 | if "`z2'"!="" { 1923 | if `"`:word 2 of `fillin''"'!="" { 1924 | qui replace `z2' = `:word 2 of `fillin'' if `merge'==1 1925 | } 1926 | else { 1927 | qui replace `z2' = 1 if `merge'==1 1928 | } 1929 | } 1930 | if "`by'"!="" { 1931 | // fillin by variables 1932 | sort `byindex' `by' 1933 | foreach v of local by { 1934 | by `byindex': qui replace `v' = `v'[1] 1935 | } 1936 | } 1937 | end 1938 | 1939 | program _colorpalette 1940 | gettoken N 0 : 0 1941 | _parse comma p 0 : 0 1942 | syntax [, n(passthru) IPolate(passthru) * ] 1943 | if c(stata_version)<14.2 { 1944 | if `N'>0 { 1945 | if `"`n'"'=="" local n n(`N') 1946 | if `"`ipolate'"'=="" local ipolate ipolate(`N') 1947 | } 1948 | colorpalette9 `p', nograph `n' `ipolate' `options' 1949 | exit 1950 | } 1951 | if `N'>0 { 1952 | if `"`n'`ipolate'"'=="" local n n(`N') 1953 | } 1954 | colorpalette `p', nograph `n' `ipolate' `options' 1955 | end 1956 | 1957 | program _symbolpalette 1958 | _parse comma p 0 : 0 1959 | syntax [, * ] 1960 | symbolpalette `p', nograph `options' 1961 | end 1962 | 1963 | version 9.2 1964 | mata: 1965 | mata set matastrict on 1966 | 1967 | void writematamatrixtodata(transmorphic M, transmorphic S, transmorphic V) 1968 | { 1969 | if (!isreal(M) | 
!isreal(S) | !isreal(V)) { 1970 | display("{err}matrix must be real") 1971 | exit(3253) 1972 | } 1973 | _writematamatrixtodata(M, S, V) 1974 | } 1975 | 1976 | void writematrixtodata(string scalar m, string scalar s, string scalar v) 1977 | { 1978 | real matrix M, S, V 1979 | 1980 | // write matrix 1981 | M = st_matrix(m) 1982 | if (s!="") S = st_matrix(s) 1983 | if (v!="") V = st_matrix(v) 1984 | _writematamatrixtodata(M, S, V) 1985 | 1986 | // generate labels 1987 | if (st_local("equations")=="") { 1988 | // standard labels 1989 | writematrixlbls(st_matrixrowstripe(st_local("matrix")), "y") 1990 | writematrixlbls(st_matrixcolstripe(st_local("matrix")), "x") 1991 | } 1992 | else { 1993 | // label equations and compile outline coordinates 1994 | writematrixeqs(st_matrixrowstripe(st_local("matrix")), 1995 | st_matrixcolstripe(st_local("matrix"))) 1996 | } 1997 | } 1998 | 1999 | void _writematamatrixtodata(real matrix M, real matrix S, real matrix V) 2000 | { 2001 | real scalar i, j, k, r, c, z, y, x, hasdrop, N, upper, lower, nodiag, 2002 | xmin, xmax, ymin, ymax, d 2003 | real scalar hasS, zS, hasV, zV 2004 | real rowvector drop 2005 | 2006 | hasS = (length(S)!=0) 2007 | hasV = (length(V)!=0) 2008 | z = st_varindex(st_local("z0")) 2009 | y = st_varindex(st_local("y0")) 2010 | x = st_varindex(st_local("x0")) 2011 | drop = strtoreal(tokens(st_local("drop"))) 2012 | upper = st_local("upper")!="" 2013 | lower = st_local("lower")!="" 2014 | nodiag = st_local("diagonal")!="" 2015 | hasdrop = (length(drop)>0) 2016 | r = rows(M); c = cols(M); d = min((r,c)); k = r*c 2017 | if (hasS) { 2018 | if (rows(S)!=r | cols(S)!=c) { 2019 | display("{err}matrix specified in {bf:size()} must have" + 2020 | " same dimension as main matrix") 2021 | exit(3200) 2022 | } 2023 | zS = st_varindex(st_local("z2")) 2024 | } 2025 | if (hasV) { 2026 | if (rows(V)!=r | cols(V)!=c) { 2027 | display("{err}matrix specified in {bf:values(label())} must have" + 2028 | " same dimension as main matrix") 
2029 | exit(3200) 2030 | } 2031 | zV = st_varindex(st_local("z3")) 2032 | } 2033 | if (nodiag) k = k - d 2034 | if (lower) k = k - (d*d-d)/2 - (c>r ? (c-r)*r : 0) 2035 | else if (upper) k = k - (d*d-d)/2 - (r>c ? (r-c)*c : 0) 2036 | N = st_nobs() 2037 | if (N < k) st_addobs(k - N) 2038 | if (!(hasdrop+lower+upper+nodiag)) { 2039 | // write full matrix (faster than the general code below) 2040 | k = 0 2041 | for (j=1; j<=c; j++) { 2042 | i = k + 1; k = j * r 2043 | st_store((i,k), z, M[,j]) 2044 | st_store((i,k), y, 1::r) 2045 | st_store((i,k), x, J(r,1,j)) 2046 | if (hasS) st_store((i,k), zS, S[,j]) 2047 | if (hasV) st_store((i,k), zV, V[,j]) 2048 | } 2049 | } 2050 | else { 2051 | // write partial matrix (general element-by-element code; speed could 2052 | // be improved by writing custom code for different situations) 2053 | k = 0 2054 | for (i=1; i<=r; i++) { 2055 | for (j=(upper ? i : 1); j<=(lower ? min((i,c)) : c); j++) { 2056 | if (nodiag) { 2057 | if (i==j) continue 2058 | } 2059 | if (hasdrop) { 2060 | if (anyof(drop, M[i,j])) continue 2061 | } 2062 | k++ 2063 | _st_store(k, z, M[i,j]) 2064 | _st_store(k, y, i) 2065 | _st_store(k, x, j) 2066 | if (hasS) _st_store(k, zS, S[i,j]) 2067 | if (hasV) _st_store(k, zV, V[i,j]) 2068 | } 2069 | } 2070 | if (k < st_nobs() & k>N) { // possible if hasdrop 2071 | stata("qui keep in 1/" + strofreal(k)) 2072 | } 2073 | } 2074 | if (st_local("syntax")=="3" | st_local("ydiscrete")!="") { 2075 | ymin = 1 + (lower & nodiag); ymax = (upper ? min((r,c-nodiag)) : r) 2076 | st_numscalar(st_local("y_K"), ymax-ymin+1) 2077 | st_numscalar(st_local("y_LB"), ymin) 2078 | st_numscalar(st_local("y_UB"), ymax) 2079 | st_numscalar(st_local("y_MIN"), ymin) 2080 | st_numscalar(st_local("y_MAX"), ymax) 2081 | st_numscalar(st_local("y_WD"), 1) 2082 | } 2083 | if (st_local("syntax")=="3" | st_local("xdiscrete")!="") { 2084 | xmin = 1 + (upper & nodiag); xmax = (lower ? 
min((c,r-nodiag)) : c) 2085 | st_numscalar(st_local("x_K"), xmax-xmin+1) 2086 | st_numscalar(st_local("x_LB"), xmin) 2087 | st_numscalar(st_local("x_UB"), xmax) 2088 | st_numscalar(st_local("x_MIN"), xmin) 2089 | st_numscalar(st_local("x_MAX"), xmax) 2090 | st_numscalar(st_local("x_WD"), 1) 2091 | } 2092 | } 2093 | 2094 | void writematrixlbls(string matrix stripe, string scalar x) 2095 | { 2096 | real scalar r, i, eq, label 2097 | string scalar lbl, lbls 2098 | pragma unset lbls 2099 | 2100 | label = st_local("label")!="" 2101 | eq = any(stripe[,1]:!=stripe[1,1]) 2102 | i = st_numscalar(st_local(x+"_MIN")) 2103 | r = st_numscalar(st_local(x+"_MAX")) 2104 | for (; i<=r; i++) { 2105 | lbl = stripe[i,2] 2106 | if (label) { 2107 | if (_st_varindex(lbl)<.) { 2108 | lbl = st_varlabel(lbl) 2109 | if (lbl=="") lbl = stripe[i,2] 2110 | } 2111 | } 2112 | if (eq) lbl = "`" + `"""' + stripe[i,1] + `":""' + "' " + 2113 | "`" + `"""' + lbl + `"""' + "'" 2114 | lbls = lbls + (i>1 ? " " : "") + strofreal(i) + " " + 2115 | "`" + `"""' + lbl + `"""' + "'" 2116 | } 2117 | st_local(x+"label", x+"label(" + lbls + ")") 2118 | } 2119 | 2120 | void writematrixeqs(string matrix R, string matrix C) 2121 | { 2122 | string colvector req, ceq 2123 | real matrix rlu, clu 2124 | pragma unset req 2125 | pragma unset ceq 2126 | pragma unset rlu 2127 | pragma unset clu 2128 | 2129 | geteqinfo(req, rlu, R) 2130 | writeeqlbls(req, rlu, "y") 2131 | geteqinfo(ceq, clu, C) 2132 | writeeqlbls(ceq, clu, "x") 2133 | writeeqcoords(req, rlu, ceq, clu) 2134 | } 2135 | 2136 | void geteqinfo(string colvector eq, real matrix lu, string matrix S) 2137 | { 2138 | real scalar i, j, r 2139 | string scalar s 2140 | 2141 | r = rows(S) 2142 | eq = J(r, 1, "") 2143 | lu = J(r, 2, .) 2144 | j = 0 2145 | for (i=1; i<=r; i++) { 2146 | j++ 2147 | s = S[i,1]; eq[j] = s; lu[j,1] = i 2148 | for (; i<=r; i++) { 2149 | if (i1 ? 
" " : "") + strofreal((lu[i,2]+lu[i,1])/2) + 2178 | " " + "`" + `"""' + lbl + `"""' + "'" 2179 | ticks = ticks + " " + strofreal(lu[i,2]+.5) 2180 | } 2181 | st_local(x+"label", x+"label(" + lbls + ", notick) " + 2182 | x+"tick(" + ticks + ")") 2183 | } 2184 | 2185 | void writeeqcoords(string colvector req, real matrix rlu, 2186 | string colvector ceq, real matrix clu) 2187 | { 2188 | real scalar i, n 2189 | string scalar coord, rlo, rup, clo, cup 2190 | pragma unset coord 2191 | 2192 | n = min((rows(req), rows(ceq))) 2193 | for (i=1; i<=n; i++) { 2194 | rlo = strofreal(rlu[i,1]-.5); rup = strofreal(rlu[i,2]+.5) 2195 | clo = strofreal(clu[i,1]-.5); cup = strofreal(clu[i,2]+.5) 2196 | coord = coord + (i>1 ? " " : "") + rlo + " " + clo + 2197 | " " + rlo + " " + cup + 2198 | " " + rup + " " + cup + 2199 | " " + rup + " " + clo + 2200 | " " + "." + " " + "." 2201 | } 2202 | st_local("eqcoords", coord) 2203 | } 2204 | 2205 | void countbins( 2206 | real scalar x0, // midpoint of first bin 2207 | real scalar ub, // upper bound 2208 | real scalar wd, // step width 2209 | real scalar h, // halfwidth of bin 2210 | real scalar r, // right inclusive 2211 | | real scalar odd) // hex: odd 2212 | { 2213 | real scalar k, x 2214 | 2215 | k = 1 2216 | x = x0 2217 | while (1) { 2218 | if (odd==1) x = x0 + k/2*wd 2219 | else x = x0 + k*wd 2220 | if (r) { 2221 | if (x>=(ub+h)) break 2222 | } 2223 | else { 2224 | if (x>(ub+h)) break 2225 | } 2226 | k++ 2227 | } 2228 | if (odd==1) k = ceil((k+1)/2) 2229 | st_numscalar(st_local("K"), k) 2230 | } 2231 | 2232 | void collectlbls(string scalar X, string scalar tag, | string scalar ID) 2233 | { 2234 | real scalar i, str, k 2235 | string scalar lbls, lab, vlab 2236 | transmorphic colvector x 2237 | real colvector id 2238 | pragma unset lbls 2239 | 2240 | str = st_isstrvar(X) 2241 | if (str) { 2242 | x = st_sdata(., X, tag) 2243 | } 2244 | else { 2245 | x = st_data(., X, tag) 2246 | vlab = st_varvaluelabel(X) 2247 | } 2248 | k = rows(x) 2249 | 
if (ID=="") id = 1::k 2250 | else id = st_data(., ID, tag) 2251 | for (i=1; i<=k; i++) { 2252 | if (str) lab = x[i] 2253 | else if (vlab!="") { 2254 | lab = st_vlmap(vlab, x[i]) 2255 | if (lab=="") lab = strofreal(x[i]) 2256 | } 2257 | else lab = strofreal(x[i]) 2258 | lbls = lbls + (i>1 ? " " : "") + strofreal(id[i]) + " " + 2259 | "`" + `"""' + lab + `"""' + "'" 2260 | } 2261 | st_local("lbls", lbls) 2262 | st_numscalar(st_local("K"), k) 2263 | } 2264 | 2265 | void fillingaps() 2266 | { 2267 | real scalar r, rby, ryx, ry, rx, i, j, a, b, aa, bb 2268 | real scalar hasywd, hasxwd 2269 | real colvector by, bynew, ynew, xnew, ywd, xwd 2270 | real matrix y, x 2271 | 2272 | // input 2273 | by = uniqrows(st_data(., st_local("byindex"))) 2274 | y = _fillingaps("y") 2275 | x = _fillingaps("x") 2276 | hasywd = cols(y)>1 2277 | hasxwd = cols(x)>1 2278 | 2279 | // expand 2280 | rby = rows(by); ry = rows(y); rx = rows(x) 2281 | ryx = ry * rx 2282 | r = rby * ryx 2283 | bynew = ynew = xnew = J(r, 1, .) 2284 | if (hasywd) ywd = J(r, 1, .) 2285 | if (hasxwd) xwd = J(r, 1, .) 
2286 | for (i=1; i<=rby; i++) { 2287 | a = 1 + (i-1) * ryx 2288 | b = a + ryx - 1 2289 | bynew[|a \ b|] = J(ryx, 1, by[i]) 2290 | for (j=1; j<=ry; j++) { 2291 | aa = a + (j-1) * rx 2292 | bb = aa + rx - 1 2293 | ynew[|aa \ bb|] = J(rx, 1, y[j,1]) 2294 | xnew[|aa \ bb|] = x[,1] 2295 | if (hasywd) ywd[|aa \ bb|] = J(rx, 1, y[j,2]) 2296 | if (hasxwd) xwd[|aa \ bb|] = x[,2] 2297 | } 2298 | } 2299 | 2300 | // put back 2301 | if (st_nobs()<(r)) st_addobs(r-st_nobs()) 2302 | st_store(., st_local("byindex"), bynew) 2303 | st_store(., st_local("y"), ynew) 2304 | st_store(., st_local("x"), xnew) 2305 | if (hasywd) st_store(., st_local("y_WD"), ywd) 2306 | if (hasxwd) st_store(., st_local("x_WD"), xwd) 2307 | } 2308 | 2309 | real matrix _fillingaps(string scalar s) 2310 | { 2311 | real scalar min, wd, r, rnew, i, j, xi, ll 2312 | real colvector x 2313 | real matrix xnew 2314 | 2315 | if (st_local(s+"cuts")!="") { 2316 | x = strtoreal(tokens(st_local(s+"cuts")))' 2317 | rnew = length(x) - 1 2318 | xnew = (x[|1\rnew|] + x[|2\.|])/2, (x[|2\.|] - x[|1\rnew|]) 2319 | // copy original values to avoid precision issues: 2320 | x = uniqrows(st_data(., st_local(s))) 2321 | r = rows(x) 2322 | j = 1 2323 | for (i=1;i<=r;i++) { 2324 | xi = x[i] 2325 | while (xi>(xnew[j,1]+xnew[j,2]/4)) { // add 1/4 of width 2326 | if (j==rnew) break // not really needed 2327 | j++ 2328 | } 2329 | xnew[j,1] = xi 2330 | } 2331 | return(xnew) 2332 | } 2333 | x = uniqrows(st_data(., st_local(s))) 2334 | if (st_local(s+"cat")!="") return(x) // use existing values only 2335 | r = rows(x) 2336 | rnew = st_numscalar(st_local(s+"_K")) 2337 | min = st_numscalar(st_local(s+"_LB")) 2338 | wd = st_numscalar(st_local(s+"_WD")) 2339 | if (anyof(("tight", "ltight"), st_local(s+"tight"))) { 2340 | if (st_local("hexagon")!="") { 2341 | if (s=="y") min = min + wd/3 2342 | else min = min + wd/4 2343 | } 2344 | else min = min + wd/2 2345 | } 2346 | xnew = J(rnew, 1, .) 
2347 | j = 1 2348 | for (i=1; i<=rnew; i++) { 2349 | xi = min + (i-1)*wd 2350 | ll = xi - wd/4 2351 | while (x[j]=ll & x[j]<=(xi+wd/4)) xnew[i] = x[j] 2356 | else xnew[i] = xi 2357 | } 2358 | return(xnew) 2359 | } 2360 | 2361 | void cliparea(real scalar hex) 2362 | { 2363 | real scalar xlb, xub, ylb, yub 2364 | real scalar ixlb, ixub, iylb, iyub, i 2365 | real colvector x, xh, y, yh, A 2366 | pragma unset A 2367 | 2368 | x = st_data(., st_local("x")) 2369 | y = st_data(., st_local("y")) 2370 | st_view(A, ., st_local("AREA")) 2371 | i = rows(x) 2372 | xlb = st_numscalar(st_local("x_LB")) 2373 | xub = st_numscalar(st_local("x_UB")) 2374 | ylb = st_numscalar(st_local("y_LB")) 2375 | yub = st_numscalar(st_local("y_UB")) 2376 | if (hex==0) { 2377 | if (st_local("xWDvar")!="") xh = st_data(., st_local("x_WD"))/2 2378 | else xh = J(i,1,st_numscalar(st_local("x_WD"))/2) 2379 | if (st_local("yWDvar")!="") yh = st_data(., st_local("y_WD"))/2 2380 | else yh = J(i,1,st_numscalar(st_local("y_WD"))/2) 2381 | for (; i; i--) { 2382 | ixlb = x[i] - xh[i]; ixub = x[i] + xh[i] 2383 | iylb = y[i] - yh[i]; iyub = y[i] + yh[i] 2384 | if (ixlb>=xlb & ixub<=xub & iylb>=ylb & iyub<=yub) continue 2385 | ixlb = max((ixlb, xlb)); ixub = min((ixub, xub)) 2386 | iylb = max((iylb, ylb)); iyub = min((iyub, yub)) 2387 | A[i] = (ixub-ixlb) * (iyub-iylb) 2388 | } 2389 | return 2390 | } 2391 | xh = st_numscalar(st_local("x_WD"))/2 2392 | yh = st_numscalar(st_local("y_WD"))/3 2393 | for (; i; i--) { 2394 | if ((y[i]-2*yh)>=ylb & (y[i]+2*yh)<=yub & 2395 | (x[i]-xh)>=xlb & (x[i]+xh)<=xub) continue 2396 | A[i] = hexarea(hexcoord(y[i], yh, ylb, yub, x[i], xh, xlb, xub)) 2397 | } 2398 | } 2399 | 2400 | real scalar hexarea(real matrix YX) 2401 | { // compute area covered by polygon (points assumed counter clockwise) 2402 | real scalar i 2403 | real colvector a, Y, X 2404 | 2405 | Y = YX[,1] \ YX[1,1] // close the polygon 2406 | X = YX[,2] \ YX[1,2] 2407 | i = rows(YX) 2408 | a = J(i,1,.) 
2409 | for (; i; i--) a[i] = ((Y[i]+Y[i+1])/2 - Y[1]) * (X[i] - X[i+1]) 2410 | return(sum(a)) 2411 | } 2412 | 2413 | real matrix hexcoord( 2414 | real scalar y, real scalar yh, real scalar ylb, real scalar yub, 2415 | real scalar x, real scalar xh, real scalar xlb, real scalar xub) 2416 | { // generate (possibly clipped) hexagon coordinates around (y,x) 2417 | // assuming ylb<=yub, xlb<=xub 2418 | // assuming that at least part of the hexagon is within the bounds 2419 | real colvector Y, X 2420 | 2421 | // define hexagon 2422 | Y = X = J(8,1,.) 2423 | Y[1] = Y[2] = y - 2*yh 2424 | Y[5] = Y[6] = y + 2*yh 2425 | Y[3] = Y[8] = y - yh 2426 | Y[4] = Y[7] = y + yh 2427 | X[1] = X[2] = x 2428 | X[5] = X[6] = x 2429 | X[3] = X[4] = x + xh 2430 | X[7] = X[8] = x - xh 2431 | // check whether clipping is needed 2432 | if (ylb<=Y[1] & yub>=Y[5] & xlb<=X[7] & xub>=X[3]) return((Y:-y, X:-x)) 2433 | // apply clipping 2434 | _hexcoord_clip_bt(Y, X, 1, 2, 1, ylb, yub, xlb, xub, yh, xh) 2435 | _hexcoord_clip_bt(Y, X, 6, 5, 0, ylb, yub, xlb, xub, yh, xh) 2436 | _hexcoord_clip_lr(Y, X, 4, 3, 0, ylb, yub, xlb, xub, yh, xh) 2437 | _hexcoord_clip_lr(Y, X, 7, 8, 1, ylb, yub, xlb, xub, yh, xh) 2438 | return((Y:-y, X:-x)) 2439 | } 2440 | 2441 | void _hexcoord_clip_bt(real colvector Y, real colvector X, 2442 | real scalar a, real scalar b, real scalar up, 2443 | real scalar ylb, real scalar yub, real scalar xlb, real scalar xub, 2444 | real scalar yh, real scalar xh) 2445 | { 2446 | real scalar lb, ub, s, d 2447 | 2448 | if (up) {; lb = ylb ; ub = Y[a]; } 2449 | else {; lb = Y[a]; ub = yub ; } 2450 | s = yh / xh 2451 | if (lb>ub) { 2452 | d = lb - ub 2453 | Y[a] = Y[b] = (up ? 
lb : ub) 2454 | if (d<=yh) { 2455 | X[a] = X[a] - d/s 2456 | X[b] = X[b] + d/s 2457 | } 2458 | else if (d<=3*yh) { 2459 | X[a] = X[a] - xh 2460 | X[b] = X[b] + xh 2461 | } 2462 | else if (d<=4*yh) { 2463 | X[a] = X[a] - (4*yh-d)/s 2464 | X[b] = X[b] + (4*yh-d)/s 2465 | } 2466 | if (xlb>X[a]) X[a] = xlb 2467 | if (X[b]>xub) X[b] = xub 2468 | } 2469 | else if (xlb>X[a]) { 2470 | d = xlb - X[a] 2471 | X[a] = X[b] = xlb 2472 | Y[a] = Y[b] = Y[a] + (up ? 1 : -1) * d * s 2473 | } 2474 | else if (X[a]>xub) { 2475 | d = X[a] - xub 2476 | X[a] = X[b] = xub 2477 | Y[a] = Y[b] = Y[a] + (up ? 1 : -1) * d * s 2478 | } 2479 | } 2480 | 2481 | void _hexcoord_clip_lr(real colvector Y, real colvector X, 2482 | real scalar a, real scalar b, real scalar up, 2483 | real scalar ylb, real scalar yub, real scalar xlb, real scalar xub, 2484 | real scalar yh, real scalar xh) 2485 | { 2486 | real scalar lb, ub, s, d 2487 | 2488 | if (up) {; lb = xlb ; ub = X[a]; } 2489 | else {; lb = X[a]; ub = xub ; } 2490 | s = yh / xh 2491 | if (lb>ub) { 2492 | d = lb - ub 2493 | X[a] = X[b] = (up ? lb : ub) 2494 | if (d<=xh) { 2495 | Y[a] = Y[a] + d*s 2496 | Y[b] = Y[b] - d*s 2497 | } 2498 | else if (d<=2*xh) { 2499 | Y[a] = Y[a] + (2*xh-d)*s 2500 | Y[b] = Y[b] - (2*xh-d)*s 2501 | } 2502 | if (Y[a]>yub) Y[a] = yub 2503 | if (ylb>Y[b]) Y[b] = ylb 2504 | } 2505 | else { 2506 | if (ylb>Y[b]) { 2507 | d = max((ylb - Y[b] - 2*yh, 0)) 2508 | Y[b] = ylb 2509 | X[b] = X[b] + (up ? 1 : -1) * d/s 2510 | if (ylb>Y[a]) { 2511 | d = ylb - Y[a] 2512 | Y[a] = ylb 2513 | X[a] = X[a] + (up ? 1 : -1) * d/s 2514 | } 2515 | } 2516 | if (Y[a]>yub) { 2517 | d = max((Y[a] - yub - 2*yh, 0)) 2518 | Y[a] = yub 2519 | X[a] = X[a] + (up ? 1 : -1) * d/s 2520 | if (Y[b]>yub) { 2521 | d = Y[b] - yub 2522 | Y[b] = yub 2523 | X[b] = X[b] + (up ? 
1 : -1) * d/s 2524 | } 2525 | } 2526 | } 2527 | } 2528 | 2529 | void fillinhexcoords() 2530 | { 2531 | real scalar i, xlb, xub, ylb, yub, xh, yh 2532 | real colvector x, y, X, Y 2533 | real matrix C 2534 | pragma unset X 2535 | pragma unset Y 2536 | 2537 | x = st_data(., st_local("x")) 2538 | y = st_data(., st_local("y")) 2539 | st_view(X, ., st_local("X")) 2540 | st_view(Y, ., st_local("Y")) 2541 | xlb = st_numscalar(st_local("x_LB")) 2542 | xub = st_numscalar(st_local("x_UB")) 2543 | ylb = st_numscalar(st_local("y_LB")) 2544 | yub = st_numscalar(st_local("y_UB")) 2545 | xh = st_numscalar(st_local("x_WD"))/2 2546 | yh = st_numscalar(st_local("y_WD"))/3 2547 | for (i = rows(x); i; i--) { 2548 | C = hexcoord(y[i], yh, ylb, yub, x[i], xh, xlb, xub) 2549 | Y[|i-8 \ i-1|] = C[,1] 2550 | X[|i-8 \ i-1|] = C[,2] 2551 | i = i-8 2552 | } 2553 | } 2554 | 2555 | string scalar quotetokens(string scalar s0) 2556 | { 2557 | real scalar i 2558 | string scalar s, space 2559 | string colvector S 2560 | pragma unset s 2561 | pragma unset space 2562 | 2563 | S = tokens(s0) 2564 | for (i=length(S); i; i--) { 2565 | s = "`" + `"""' + S[i] + `"""' + "'" + space + s 2566 | space = " " 2567 | } 2568 | return(s) 2569 | } 2570 | 2571 | void swapxy() 2572 | { 2573 | real scalar i 2574 | string scalar tmp 2575 | string colvector l 2576 | 2577 | l = ("0", "", "_K", "_LB", "_UB", "_MIN", "_MAX", "_WD", "tight", "clip", 2578 | "cat", "discrete", "label", "title") 2579 | for (i=cols(l); i; i--) { 2580 | tmp = st_local("x"+l[i]) 2581 | st_local("x"+l[i], st_local("y"+l[i])) 2582 | st_local("y"+l[i], tmp) 2583 | } 2584 | tmp = st_local("X") 2585 | st_local("X", st_local("Y")) 2586 | st_local("Y", tmp) 2587 | } 2588 | 2589 | void generatescalecoords(real rowvector cuts, string rowvector vnames) 2590 | { 2591 | real scalar i, j, n, lo, up 2592 | real matrix coord // ID Y X 2593 | 2594 | n = cols(cuts) 2595 | coord = J(5*(n-1), 3, .) 
2596 | j = 0 2597 | up = cuts[1] 2598 | 2599 | for (i=1;ist_nobs()) { 2610 | st_local("ramp_N_preserve", strofreal(n-st_nobs())) 2611 | st_addobs(n-st_nobs()) 2612 | } 2613 | st_store((1,n), vnames, coord) 2614 | } 2615 | 2616 | end 2617 | 2618 | exit 2619 | -------------------------------------------------------------------------------- /heatplot.pkg: -------------------------------------------------------------------------------- 1 | v 3 2 | d heatplot: module to create heat plots and hexagon plots 3 | d 4 | d Author: Ben Jann, University of Bern, ben.jann@unibe.ch 5 | d 6 | d Distribution-Date: 20210824 7 | f hexplot.sthlp 8 | f heatplot.sthlp 9 | f heatplot.ado 10 | f hexplot.ado 11 | -------------------------------------------------------------------------------- /heatplot.sthlp: -------------------------------------------------------------------------------- 1 | {smcl} 2 | {* 23jul2021}{...} 3 | {hi:help heatplot}{right:{browse "http://github.com/benjann/heatplot/"}} 4 | {hline} 5 | 6 | {title:Title} 7 | 8 | {pstd}{hi:heatplot} {hline 2} Command to create heat plots 9 | 10 | 11 | {title:Syntax} 12 | 13 | {pstd} 14 | Syntax 1: Heat plot from variables 15 | 16 | {p 8 15 2} 17 | {cmd:heatplot} [{it:z}] {it:y} {it:x} {ifin} {weight} 18 | [{cmd:,} 19 | {help heatplot##zopt:{it:z_options}} 20 | {help heatplot##yxopt:{it:yx_options}} 21 | {help heatplot##gropt:{it:graph_options}} 22 | {help heatplot##genopt:{it:generate_options}} 23 | ] 24 | 25 | {pmore} 26 | where {it:z} is a numeric variable (assumed constant if omitted), {it:y} 27 | is a numeric variable or a string variable, and {it:x} is a numeric variable 28 | or a string variable. Categorical {it:y} and {it:x} variables can be specified 29 | as {cmd:i.}{it:varname}. 
30 | 31 | {pstd} 32 | Syntax 2: Heat plot from Mata matrix 33 | 34 | {p 8 15 2} 35 | {cmd:heatplot} {opt mata(name)} 36 | [{cmd:,} 37 | {help heatplot##zopt:{it:z_options}} 38 | {help heatplot##yxopt:{it:yx_options}} 39 | {help heatplot##mopt:{it:matrix_options}} 40 | {help heatplot##gropt:{it:graph_options}} 41 | {help heatplot##genopt:{it:generate_options}} 42 | ] 43 | 44 | {pmore} 45 | where {it:name} is a numeric {help mata:Mata matrix} (contents = {it:z}, row index = {it:y}, 46 | column index = {it:x}). 47 | 48 | {pstd} 49 | Syntax 3: Heat plot from Stata matrix 50 | 51 | {p 8 15 2} 52 | {cmd:heatplot} {it:matname} [{cmd:,} 53 | {help heatplot##zopt:{it:z_options}} 54 | {help heatplot##mopt:{it:matrix_options}} 55 | {help heatplot##gropt:{it:graph_options}} 56 | {help heatplot##genopt:{it:generate_options}} 57 | ] 58 | 59 | {pmore} 60 | where {it:matname} is a {help matrix:Stata matrix} 61 | (contents = {it:z}, row names = {it:y}, column names = {it:x}). 62 | 63 | 64 | {synoptset 22}{...} 65 | {marker zopt}{synopthdr:z_options} 66 | {synoptline} 67 | {synopt :{helpb heatplot##levels:{ul:lev}els({it:#})}}number of color bins 68 | {p_end} 69 | {synopt :{helpb heatplot##cuts:{ul:cut}s({it:numlist})}}custom cut points for color bins 70 | {p_end} 71 | {synopt :{helpb heatplot##colors:{ul:c}olors({it:palette})}}color map to be used for the color bins 72 | {p_end} 73 | {synopt :{helpb heatplot##backfill:{ul:backf}ill{sf:[}({it:options}){sf:]}}}fill background using first or last color 74 | {p_end} 75 | {synopt :{helpb heatplot##statistic:{ul:s}tatistic({it:stat})}}(syntax 1 and 2 only) type of aggregation 76 | {p_end} 77 | {synopt :{helpb heatplot##fast:fast}}(syntax 1 and 2 only) use fast aggregation; requires {helpb gtools} 78 | {p_end} 79 | {synopt :{helpb heatplot##normalize:{ul:norm}alize}}normalize z values by the size of the covered area 80 | {p_end} 81 | {synopt :{helpb heatplot##transform:{ul:trans}form({it:@exp})}}transform z values before creating color 
bins 82 | {p_end} 83 | {synopt :{helpb heatplot##size:size}}scale size of color fields by absolute values of z 84 | {p_end} 85 | {synopt :{helpb heatplot##size:size({it:spec})}}scale size of color fields by alternative values 86 | {p_end} 87 | {synopt :{helpb heatplot##sizeprop:{ul:sizep}rop}}(syntax 1 only) scale size of color fields by relative frequencies 88 | {p_end} 89 | {synopt :{helpb heatplot##recenter:{ul:rec}enter}}(syntax 1 only) recenter color fields at data center within field 90 | {p_end} 91 | {synopt :{helpb heatplot##srange:{ul:sr}ange({it:lo} {sf:[}{it:up}{sf:]})}}set range of relative sizes of color fields 92 | {p_end} 93 | {synopt :{helpb heatplot##missing:{ul:m}issing{sf:[}({it:options}){sf:]}}}display missing values 94 | {p_end} 95 | {synopt :{helpb heatplot##values:{ul:val}ues{sf:[}({it:options}){sf:]}}}display numeric values as marker labels 96 | {p_end} 97 | {synopt :{helpb heatplot##hexagon:{ul:hex}agon{sf:[}({it:options}){sf:]}}}display color fields as hexagons 98 | instead of rectangles; also see {helpb hexplot} 99 | {p_end} 100 | {synopt :{helpb heatplot##scatter:scatter{sf:[}({it:palette}){sf:]}}}display marker symbols instead of color fields 101 | {p_end} 102 | {synopt :{helpb heatplot##keylabels:{ul:keyl}abels({it:spec})}}determine how color fields are labelled in the legend 103 | {p_end} 104 | {synopt :{helpb heatplot##ramp:ramp{sf:[}({it:options}){sf:]}}}display a color ramp instead of the legend 105 | {p_end} 106 | {synopt :{helpb heatplot##p:p{sf:[}{it:#}{sf:]}({it:area_options})}}detailed rendering of color fields 107 | {p_end} 108 | {synoptline} 109 | 110 | {synoptset 22}{...} 111 | {marker yxopt}{synopthdr:yx_options} 112 | {synoptline} 113 | {synopt :{helpb heatplot##bins:{sf:[}x{sf:|}y{sf:]}bins({it:spec})}}how 114 | continuous {it:y} and {it:x} are binned 115 | {p_end} 116 | {synopt :{helpb heatplot##bins:{sf:[}{ul:x}{sf:|}{ul:y}{sf:]}{ul:bw}idth({it:spec})}}how 117 | continuous {it:y} and {it:x} are binned; alternative to 
{cmd:bins()} 118 | {p_end} 119 | {synopt :{helpb heatplot##bcuts:{sf:[}{ul:x}{sf:|}{ul:y}{sf:]}{ul:bc}uts({it:numlist})}}how 120 | continuous {it:y} and {it:x} are binned; alternative to {cmd:bins()} 121 | {p_end} 122 | {synopt :{helpb heatplot##discrete:{sf:[}{ul:x}{sf:|}{ul:y}{sf:]}{ul:discr}ete{sf:[}({it:#}){sf:]}}}treat variables 123 | as discrete and omit binning 124 | {p_end} 125 | {synopt :{helpb heatplot##clip:{sf:[}{ul:l}{sf:|}{ul:r}|{ul:b}{sf:|}{ul:t}{sf:]}clip}}clip color fields at outer bounds 126 | {p_end} 127 | {synopt :{helpb heatplot##fillin:{ul:fill}in({it:#} {sf:[}{it:#}{sf:]})}}(syntax 1 only) fill in empty combinations of (binned) 128 | {it:y} and {it:x} by setting {it:z} to {it:#} 129 | {p_end} 130 | {synoptline} 131 | 132 | {synoptset 22}{...} 133 | {marker mopt}{synopthdr:matrix_options} 134 | {synoptline} 135 | {synopt :{helpb heatplot##drop:drop({it:numlist})}}drop elements equal to one of the values in {it:numlist} 136 | {p_end} 137 | {synopt :{helpb heatplot##lower:lower}}only display lower triangle 138 | {p_end} 139 | {synopt :{helpb heatplot##upper:upper}}only display upper triangle 140 | {p_end} 141 | {synopt :{helpb heatplot##nodiagonal:{ul:nodiag}onal}}omit diagonal 142 | {p_end} 143 | {synopt :{helpb heatplot##equations:{ul:eq}uations{sf:[}({it:line_opts}){sf:]}}}(syntax 3 only) label and outline equations 144 | {p_end} 145 | {synoptline} 146 | 147 | {synoptset 22}{...} 148 | {marker gropt}{synopthdr:graph_options} 149 | {synoptline} 150 | {synopt :{helpb heatplot##nograph:{ul:nogr}aph}}do not create a graph 151 | {p_end} 152 | {synopt :{helpb heatplot##label:{sf:[}{ul:no}{sf:]}{ul:l}abel}}do/do not use variable/value labels 153 | {p_end} 154 | {synopt :{helpb heatplot##addplot:addplot({it:plot})}}add other plots to the graph 155 | {p_end} 156 | {synopt :{helpb heatplot##addplotnopreserve:{ul:addplotnopr}eserve}}technical option relevant for {cmd:addplot()} 157 | {p_end} 158 | {synopt :{helpb heatplot##by:by({it:varlist}{sf:[}, 
{it:byopts}{sf:]})}}(syntax 1 only) repeat plot by subgroups 159 | {p_end} 160 | {synopt :{helpb heatplot##twopts:{it:twoway_options}}}general twoway options 161 | {p_end} 162 | {synoptline} 163 | 164 | {synoptset 22}{...} 165 | {marker genopt}{synopthdr:generate_options} 166 | {synoptline} 167 | {synopt :{helpb heatplot##generate:{ul:gen}erate{sf:[}({it:namelist}){sf:]}}}store 168 | the plotted data as variables 169 | {p_end} 170 | {synopt :{helpb heatplot##replace:{ul:r}eplace}}allow overwriting existing variables 171 | {p_end} 172 | {synopt :{helpb heatplot##nopreserve:{ul:nopres}erve}}replace the original data by 173 | the plotted data 174 | {p_end} 175 | {synoptline} 176 | 177 | {pstd} 178 | {cmd:fweight}s, {cmd:aweight}s, {cmd:iweight}s, and {cmd:pweight}s are allowed with Syntax 1; see help {help weight}. 179 | 180 | 181 | {title:Description} 182 | 183 | {pstd} 184 | {cmd:heatplot} creates heat plots from variables or matrices. One 185 | example of a heat plot is a two-dimensional histogram in which the 186 | frequencies of combinations of binned {it:y} and {it:x} are displayed as 187 | rectangular (or hexagonal) fields using a color gradient. Another example 188 | is a plot of a trivariate distribution where the color gradient is used to 189 | visualize the (average) value of {it:z} within bins of {it:y} and 190 | {it:x}. Yet another example is a plot that displays the contents of a matrix, 191 | say, a correlation matrix or a spatial weights matrix, using a color 192 | gradient. For a selection of different applications, see the 193 | {help heatplot##examples:Examples} below. 194 | 195 | 196 | {title:Dependencies} 197 | 198 | {pstd} 199 | {cmd:heatplot} requires {cmd:palettes} (Jann 2018) and, in Stata 14.2 or newer, 200 | {cmd:colrspace} (Jann 2019a). To install these packages, type 201 | 202 | {com}. ssc install palettes, replace 203 | .
ssc install colrspace, replace{txt} 204 | 205 | {pstd} 206 | The {cmd:fast} option requires {cmd:gtools} (Caceres Bravo 2018). To install 207 | {cmd:gtools}, type 208 | 209 | {com}. ssc install gtools, replace 210 | . gtools, upgrade{txt} 211 | 212 | 213 | {title:Options} 214 | 215 | {dlgtab:z_options} 216 | 217 | {marker levels}{...} 218 | {phang} 219 | {opt levels(#)} sets the number of color bins into which {it:z} (or the observed frequencies, if 220 | {it:z} is omitted) is categorized, using a regular grid of intervals from the observed 221 | minimum of (aggregated) {it:z} to the observed maximum. The intervals are right-open, except 222 | for the last interval, which is right-closed. Only one of 223 | {cmd:levels()} and {cmd:cuts()} is allowed. 224 | 225 | {marker cuts}{...} 226 | {phang} 227 | {opt cuts(numlist)} specifies the thresholds used to categorize {it:z}, 228 | where {it:numlist} is an ascending number list; see help 229 | {it:{help numlist}}. The intervals defined by the number list are right-open, except 230 | for the last interval, which is right-closed. If the smallest number is 231 | larger than the observed minimum of (aggregated) {it:z}, an extra interval 232 | will be added at the bottom. Likewise, if the largest number is smaller than the 233 | maximum, an extra interval will be added at the top. Within the number list, 234 | you can use {cmd:@min} and {cmd:@max} to denote the 235 | observed minimum and maximum, respectively. For example, you could type 236 | 237 | {cmd:@min(20)@max} 238 | 239 | {pmore} 240 | to create intervals from the minimum to the maximum in steps of 20. Furthermore, 241 | expressions specified as {cmd:{c -(}}{it:{help exp}}{cmd:{c )-}} will be evaluated 242 | before expanding the number list. For example, you could type 243 | 244 | {cmd:{c -(}@min-.5{c )-}(1){c -(}@max+.5{c )-}} 245 | 246 | {pmore} 247 | to create intervals from the observed minimum minus 0.5 to the observed maximum 248 | plus 0.5 in steps of 1.
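{pmore}
A minimal sketch of {cmd:cuts()} in use, assuming the {cmd:auto} example data shipped with Stata; the choice of variables and the step size of 2000 are illustrative assumptions and are not taken from this help file:

        {com}. sysuse auto, clear
        . heatplot price weight mpg, cuts(@min(2000)@max){txt}

{pmore}
Under these assumptions, the mean of {cmd:price} within bins of {cmd:weight} and {cmd:mpg} should be categorized into intervals of width 2000 starting at the observed minimum. If the resulting grid stops short of the observed maximum, an extra interval is added at the top, as described for {cmd:cuts()}.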
249 | 250 | {marker colors}{...} 251 | {phang} 252 | {cmd:colors(}[{it:{help colorpalette##palette:palette}}] [{cmd:,} 253 | {it:{help colorpalette##opts:palette_options}}]{cmd:)} 254 | selects the color palette to be used for the bins of {it:z}. This also 255 | sets the number of bins, unless {cmd:levels()} or {cmd:cuts()} is specified. {it:palette} is any 256 | palette allowed by {helpb colorpalette} (which can also be a simple list of colors, see 257 | help {it:{help colorpalette##colorlist:colorlist}}) and {it:palette_options} are 258 | corresponding options. The default is {cmd:colors(viridis)}. For example, to use a 259 | red to blue HCL color scheme, you could type {cmd:colors(hcl bluered, reverse)}. In 260 | Stata versions older than 14.2, {cmd:heatplot} uses command 261 | {helpb colorpalette9} instead of {helpb colorpalette} 262 | and sets the default to {cmd:colors(hcl, viridis)}. 263 | 264 | {marker backfill}{...} 265 | {phang} 266 | {cmd:backfill}[{cmd:(}{it:options}{cmd:)}] fills the background (the plotregion) 267 | using the first color of the colors provided by {cmd:colors()}. This makes 268 | sense, for example, in a bivariate histogram. When applying {cmd:backfill}, 269 | you may want to turn grid lines off; in most situations this can be achieved by 270 | typing {cmd:ylabel(, nogrid)} and/or {cmd:xlabel(, nogrid)}. {it:options} are: 271 | 272 | {phang2} 273 | {cmdab:l:ast} uses the last color instead of the first color. 274 | 275 | {phang2} 276 | {cmdab:i:nner} only colors the inner plotregion. The default is to color both, 277 | the inner and the outer plotregion. 278 | 279 | {marker statistic}{...} 280 | {phang} 281 | {opt statistic(stat)} sets the type of aggregation of z values within 282 | yx-bins. {opt statistic()} is only allowed in syntax 1 and 2 (in syntax 3, 283 | each cell of the specified matrix constitutes a separate color field; hence, there is 284 | no aggregation). {it:stat} can be any statistic supported by {helpb collapse}. 
In 285 | addition, {it:stat} can be {cmd:proportion} to compute proportions (rather 286 | than percentages) or {cmd:asis} to skip aggregation. The main purpose of 287 | {cmd:asis} is to save computer time in cases where no aggregation is 288 | needed; typically, you only want to specify {cmd:asis} if you are certain 289 | that all combinations of (binned) {it:y} and {it:x} are unique. 290 | 291 | {pmore} 292 | In syntax 1, if variable {it:z} is provided, the default is 293 | {cmd:statistic(mean)}; if {it:z} is omitted, the default is 294 | {cmd:statistic(percent)}. In syntax 2, the default is {cmd:statistic(sum)}, 295 | or, if {cmd:discrete} has been specified, {cmd:statistic(asis)}. 296 | 297 | {marker fast}{...} 298 | {phang} 299 | {opt fast} performs some of the computations using fast commands provided 300 | by {helpb gtools} (e.g. {helpb gcollapse} instead of official {helpb collapse} 301 | for aggregation). Use this option to speed up computations in very large 302 | datasets. The {helpb gtools} package (Caceres Bravo 2018) has to be 303 | installed on the system; see {browse "http://github.com/mcaceresb/stata-gtools"} 304 | for more information. Option {opt fast} is only allowed in syntax 1 and 2. 305 | 306 | {marker normalize}{...} 307 | {phang} 308 | {opt normalize} causes the (aggregated) z values to be normalized by the size 309 | of the area covered by a color field before assigning colors. For example, specifying 310 | {cmd:normalize} together with {cmd:statistic(proportion)} will visualize densities 311 | instead of proportions. {cmd:normalize} may be useful if you clip the color 312 | fields using option {helpb heatplot##clip:clip} or if you apply option 313 | {helpb heatplot##bcuts:bcuts()} in a way such that color fields have different 314 | sizes. The area covered by a color field will be computed before rescaling 315 | the fields according to {helpb heatplot##size:size()} or 316 | {helpb heatplot##sizeprop:sizeprop}. 
If option 317 | {helpb heatplot##scatter:scatter} is specified, the sizes of the areas 318 | will be computed as if {cmd:scatter} was not specified. If 319 | option {helpb heatplot##hexagon:hexagon} is specified, the computations will 320 | be based on the (possibly clipped) shapes of the hexagons. {cmd:normalize} 321 | has no effect in syntax 3. 322 | 323 | {marker transform}{...} 324 | {phang} 325 | {opt transform(@exp)} causes the (aggregated and, possibly, normalized) z 326 | values to be transformed before assigning colors. {it:@exp} is 327 | an expression (see {it:{help exp}}) in which {cmd:@} acts as a placeholder 328 | for the values to be transformed. For example, to take the natural logarithm, 329 | type {cmd:transform(ln(@))}. 330 | 331 | {marker size}{...} 332 | {phang} 333 | {opt size}[{cmd:(}{it:spec}{cmd:)}] scales the sizes of the color fields. If 334 | {cmd:size} is specified without argument, the color fields will be scaled in 335 | proportion to the absolute value of (aggregated and, possibly, 336 | normalized) z. Alternatively, provide a custom source for the scaling as 337 | follows. 338 | 339 | {pmore} 340 | In syntax 1, specify 341 | {cmd:size(}{it:exp}[{cmd:,} {cmdab:s:tatistic(}{it:stat}{cmd:)}]{cmd:)} 342 | to obtain the size information from {it:{help exp}} (typically, {it:exp} 343 | is a simple variable name). Observations for which {it:exp} is missing will 344 | {it:not} be excluded from the estimation sample; if {it:exp} is missing for 345 | all observations within a specific color field, the size of the field will 346 | be set to the minimum as set by {helpb heatplot##srange:srange()}. 347 | 348 | {pmore} 349 | In syntax 2, specify 350 | {cmd:size(}{it:name}[{cmd:,} {cmdab:s:tatistic(}{it:stat}{cmd:)}]{cmd:)} 351 | to obtain the size information from Mata matrix {it:name}. The matrix must 352 | be numeric and have the same dimension as the main matrix. 
353 | 354 | {pmore} 355 | In syntax 3, specify {opt size(matname)} to obtain the size information 356 | from Stata matrix {it:matname}. The matrix must have the same dimension 357 | as the main matrix. 358 | 359 | {pmore} 360 | In syntax 1 and 2, suboption {opt statistic(stat)} sets the type 361 | of aggregation, where {it:stat} can be any statistic supported by 362 | {helpb collapse} (the default is {cmd:mean}). Suboption {cmd:statistic()} 363 | is only relevant if the main {helpb heatplot##statistic:statistic()} option 364 | has not been set to {cmd:asis}. In any case, absolute values of the 365 | (possibly aggregated) information will be used for the scaling. 366 | 367 | {marker sizeprop}{...} 368 | {phang} 369 | {opt sizeprop} scales the size of the color fields in proportion to the 370 | relative frequency of the underlying data. Use {cmd:sizeprop} as an 371 | alternative to {cmd:size()}. {opt sizeprop} is only allowed in syntax 1. 372 | 373 | {marker recenter}{...} 374 | {phang} 375 | {opt recenter} moves the color fields such that their center is at 376 | the center of the included data (mean of y and x within the bin). This can 377 | be useful if {cmd:size()} or {cmd:sizeprop} has been specified. {cmd:recenter} is only allowed 378 | in syntax 1. 379 | 380 | {marker srange}{...} 381 | {phang} 382 | {cmd:srange(}{it:lo} [{it:up}]{cmd:)} sets the range of relative sizes of 383 | the color fields. {cmd:srange()} is only relevant if {cmd:size()} or 384 | {cmd:sizeprop} has been specified. Let {it:v}, {it:v}>=0, be the variable to which the 385 | field sizes should be proportional (e.g. relative frequencies). The field 386 | sizes are then computed as {it:lo} + {it:v}/max({it:v}) * 387 | ({it:up} - {it:lo}). The default is {it:lo}=0 and {it:up}=1, that is, the smallest 388 | possible field has size 0 (invisible) and the largest field has size 1 389 | (full size). 
Specify, for example, {cmd:srange(0.5)} to set the size of the 390 | smallest possible field to 0.5 (half of full size). 391 | 392 | {marker missing}{...} 393 | {phang} 394 | {opt missing}[{opt (options)}] displays color fields for 395 | combinations of (binned) y and x for which (aggregated) 396 | z is equal to missing. {it:options} are: 397 | 398 | {phang2} 399 | {opt l:abel(label)} sets the text to be used for missing values in the legend. The 400 | default is {cmd:label("missing")}. 401 | 402 | {phang2} 403 | {it:area_options} are options to affect the look of the missing 404 | value color fields; see help {it:{help area_options}}. The default is to 405 | display the fields in black. 406 | 407 | {marker values}{...} 408 | {phang} 409 | {opt values}[{opt (options)}] displays the z values (or other information) 410 | as marker labels in the middle of the color fields. {it:options} are: 411 | 412 | {phang2} 413 | {opt l:abel(spec)} determines the source of the values of the labels. The 414 | default is to display the (aggregated and, possibly, normalized) z 415 | values. Alternatively, provide a custom source as follows. 416 | 417 | {pmore2} 418 | In syntax 1, specify 419 | {cmd:label(}{it:exp}[{cmd:,} {cmdab:s:tatistic(}{it:stat}{cmd:)}]{cmd:)} 420 | to display the (aggregated) value of {it:{help exp}} (typically, {it:exp} 421 | is a simple variable name). Observations for which {it:exp} is missing will 422 | {it:not} be excluded from the estimation sample. The default for 423 | {it:stat} is {cmd:mean} if {it:exp} is numeric and {cmd:first} if {it:exp} is string. 424 | 425 | {pmore2} 426 | In syntax 2, specify 427 | {cmd:label(}{it:name}[{cmd:,} {cmdab:s:tatistic(}{it:stat}{cmd:)}]{cmd:)} 428 | to obtain the values from Mata matrix {it:name}. The matrix must 429 | be numeric and have the same dimension as the main matrix. The default for 430 | {it:stat} is {cmd:mean}. 
431 | 432 | {pmore2} 433 | In syntax 3, specify {opt label(matname)} to obtain the values 434 | from Stata matrix {it:matname}. The matrix must have the same dimension 435 | as the main matrix. 436 | 437 | {pmore2} 438 | In syntax 1 and 2, suboption {opt statistic(stat)} sets the type of 439 | aggregation, where {it:stat} can be any statistic supported by 440 | {helpb collapse}. The suboption is only relevant if the main 441 | {helpb heatplot##statistic:statistic()} option has not been set to {cmd:asis}. 442 | 443 | {phang2} 444 | {opt trans:form(@exp)} causes the values to be transformed before 445 | displaying. {it:@exp} is an expression (see {it:{help exp}}) in which 446 | {cmd:@} acts as a placeholder for the values to be transformed. The result 447 | of {it:@exp} may be numeric or string; for example, you could type 448 | {cmd:transform(cond(@>0, "+", cond(@<0, "-", "")))} to display "+" or "-" 449 | (or nothing) depending on whether the value is (strictly) positive or 450 | (strictly) negative. 451 | 452 | {phang2} 453 | {opth sty:le(markerlabelstyle)} sets the overall style of the labels. 454 | 455 | {phang2} 456 | {opth p:osition(clockposstyle)} specifies where the label is to be located 457 | relative to the middle of the color field. The default is {cmd:position(0)} 458 | (centered). 459 | 460 | {phang2} 461 | {opth g:ap(size)} specifies how much space should be put between the 462 | label and the middle of the color field. This is only relevant if {cmd:position()} 463 | is not 0. 464 | 465 | {phang2} 466 | {opth ang:le(anglestyle)} specifies the angle of the text. 467 | 468 | {phang2} 469 | {opth s:ize(textsizestyle)} specifies the size of the text. 470 | 471 | {phang2} 472 | {opth c:olor(colorstyle)} specifies the color of the text. Default is 473 | {cmd:color(black)} (unless {cmd:style()} is specified in which case the 474 | color is determined by the selected style). 475 | 476 | {phang2} 477 | {opth f:ormat(%fmt)} sets the display format for the values. 
This option is 478 | only useful if the labels are numeric. 479 | 480 | {pmore} 481 | For more details on options {cmd:style()} through {cmd:format()} also see 482 | the corresponding options with {cmd:"mlab"} prefix in {it:{help marker_label_options}}. 483 | 484 | {marker hexagon}{...} 485 | {phang} 486 | {opt hexagon}[{opt (options)}] causes the color fields to be rendered as 487 | hexagons instead of rectangles; also see {helpb hexplot}. {cmd:hexagon} and 488 | {cmd:scatter} are not both allowed. {it:options} are: 489 | 490 | {phang2} 491 | {opt hor:izontal} arranges the hexagons horizontally. The default is to arrange 492 | the hexagons vertically. 493 | 494 | {phang2} 495 | {opt left} starts with a left-shifted hexagon row. The default is to start 496 | with a right-shifted row. If {cmd:horizontal} is specified, {cmd:left} 497 | starts with a down-shifted row instead of an up-shifted row. 498 | 499 | {phang2} 500 | {opt odd} uses an odd number of hexagon columns. The default is to use 501 | an even number of columns. That is, by default the bins on the x-axis are 502 | constructed in a way such that each bin contains a double column of hexagons, 503 | yielding an even overall number of columns. Specify {cmd:odd} to construct the 504 | bins in a way such that the last bin only contains a single column. If 505 | {cmd:horizontal} is specified, {cmd:odd} affects the number of 506 | rows rather than columns. 507 | 508 | {marker scatter}{...} 509 | {phang} 510 | {cmd:scatter}[{cmd:(}{it:{help symbolpalette##palette:palette}} [{cmd:,} 511 | {it:{help symbolpalette##opts:palette_options}}]{cmd:)}] causes the heat 512 | plot to be rendered as a scatter plot, with markers placed at the centers 513 | of the bins. Only one of {cmd:scatter} and {cmd:hexagon} is allowed. 
{it:palette} 514 | is any palette allowed by {helpb symbolpalette} (which can also be a simple list of 515 | symbol styles, see help {it:{help symbolpalette##symbollist:symbollist}}) and 516 | {it:palette_options} are corresponding options. The default is 517 | {cmd:scatter(circle)}. If fewer symbol styles are specified than there are 518 | bins of {it:z}, the symbols will be recycled. 519 | 520 | {marker keylabels}{...} 521 | {phang} 522 | {opt keylabels(spec)} selects the legend keys to be labeled and affects the 523 | formatting of the labels. {it:spec} 524 | is 525 | 526 | [{it:rule}] [{cmd:,} {it:suboptions} ] 527 | 528 | {pmore} 529 | where {it:rule} may be 530 | 531 | {p2colset 13 23 25 2}{...} 532 | {p2col:{it:{help numlist}}}label the specified keys, where 1 refers to the first key (lowest value), 2 to the second, etc. 533 | {p_end} 534 | {p2col:{cmd:minmax}}label the lower boundary of the first key and the upper boundary of the last key 535 | {p_end} 536 | {p2col:{cmd:none}}omit the labels 537 | {p_end} 538 | {p2col:{cmd:all}}label all keys; this is the default unless there are more than 24 keys 539 | {p_end} 540 | 541 | {pmore} 542 | and suboptions are: 543 | 544 | {phang2} 545 | {opth f:ormat(%fmt)} sets the display format. The default is {cmd:%7.0g}. 546 | 547 | {phang2} 548 | {opt trans:form(@exp)} causes the values to be transformed before being 549 | displayed. {it:@exp} is an expression in which {cmd:@} acts as 550 | a placeholder for the values to be transformed. Typically, {it:@exp} will be 551 | the inverse of the main {helpb heatplot##transform:transform()} 552 | option. Example: {cmd:transform(ln(@))} would go along with 553 | {cmd:keylab(,transform(exp(@)))}. 554 | 555 | {phang2} 556 | {opt inter:val} uses interval notation for the key labels, e.g. [0,10), 557 | [10,20), etc. Only one of {cmd:interval} and {cmd:range()} 558 | is allowed. The default is to display interval midpoints. 
{cmd:interval} 559 | has no effect if {it:rule} is {cmd:minmax}. 560 | 561 | {phang2} 562 | {opt ran:ge(#)} uses range notation for the key labels. Argument {it:#} specifies by how much the upper 563 | bound should be reduced. For example, {cmd:range(1)} would display interval [10,20) as "10-19", whereas 564 | {cmd:range(0.1)} would display the interval as "10-19.9". You may want to set an appropriate 565 | display format when specifying {cmd:range()}; see {cmd:format()} above. {cmd:interval} and {cmd:range()} 566 | are not both allowed. {cmd:range()} has no effect if {it:rule} is {cmd:minmax}. 567 | 568 | {phang2} 569 | {opt area} displays legend keys as areas (rectangles) even if {cmd:scatter} has been specified. 570 | 571 | {phang2} 572 | {it:textbox_options} are general options to affect the rendering of the 573 | labels, such as {cmd:size()}; see {it:{help textbox_options}}. 574 | 575 | {phang2} 576 | {it:legend_options} are further options affecting the rendering of the 577 | legend; see {it:contents} and {it:location} in help {it:{help legend_options}}. 578 | 579 | {marker ramp}{...} 580 | {phang} 581 | {cmd:ramp}[{cmd:(}{it:options}{cmd:)}] renders the legend as a color ramp 582 | instead of using {helpb graph}'s legend 583 | option. Internally, if {cmd:ramp} is specified, two graphs are created, one 584 | for the main plot and one for the color ramp; these plots are then combined into a single graph 585 | using {helpb graph combine}. {it:options} are: 586 | 587 | {phang2} 588 | {opt l:eft}, {opt r:ight}, {opt t:op}, or {opt b:ottom} specify the location 589 | of the ramp on the final graph. The location also affects the orientation of 590 | the ramp. In case of {cmd:top} or {cmd:bottom}, the ramp 591 | will be oriented horizontally; in case of {cmd:left} or {cmd:right}, the ramp 592 | will be oriented vertically. The default is {cmd:bottom}. 
593 | 594 | {phang2} 595 | {opt lab:els(rule_or_values)} specifies how the axis of the ramp 596 | should be labeled and ticked, where {it:rule_or_values} is as described in 597 | {it:{help axis_label_options}}. You may specify {cmd:@min} 598 | and {cmd:@max} to refer to the lower bound of the first interval and the 599 | upper bound of the last interval. For example, you could type 600 | {cmd:labels(@min .5 @max)} to place a label at the minimum, at 0.5, and at the maximum. Various 601 | suboptions are available to control the rendering of the labels and ticks; see 602 | {it:{help axis_label_options}}. 603 | 604 | {phang2} 605 | {opth f:ormat(%fmt)} sets the display format for the labels. The default 606 | is {cmd:%7.0g}. 607 | 608 | {phang2} 609 | {opt l:ength(#)} sets the length of the ramp as a percentage of the 610 | available space (the graph's width or height, depending on the orientation 611 | of the ramp). In horizontal orientation the default is {cmd:length(80)}; in 612 | vertical orientation the default is {cmd:length(60)}. 613 | 614 | {phang2} 615 | {opt s:pace(#)} specifies the space to be consumed by the plot containing 616 | the ramp, as a percentage of the overall size of the graph. In horizontal 617 | orientation the default is {cmd:space(12)}; in 618 | vertical orientation the default is {cmd:space(20)}. 619 | 620 | {phang2} 621 | {opt trans:form(@exp)} causes the ramp to be displayed on a transformed 622 | scale. {it:@exp} is an expression in which {cmd:@} acts as 623 | a placeholder for the values to be transformed. Typically, {it:@exp} will be 624 | the inverse of the main {helpb heatplot##transform:transform()} 625 | option. For example, {cmd:transform(ln(@))} would go along with 626 | {cmd:ramp(transform(exp(@)))}. 627 | 628 | {phang2} 629 | {opt c:ombine(combine_options)} are options to be passed through to 630 | {helpb graph combine}, such as {it:{help region_options}}. 
Note that the 631 | following options will be collected from the main options 632 | and passed through to {helpb graph combine} automatically: {cmd:title()}, 633 | {cmd:subtitle()}, {cmd:note()}, {cmd:caption()}, {cmd:ysize()}, 634 | {cmd:xsize()}, {cmd:nodraw}, {cmd:scheme()}, {cmd:name()}, and 635 | {cmd:saving()}. 636 | 637 | {phang2} 638 | {it:{help twoway_options}} are general options to be applied to the plot 639 | containing the color ramp. 640 | 641 | {marker p}{...} 642 | {phang} 643 | {opt p}[{it:#}]{opt (area_options)} provides options to affect the rendering of 644 | the color fields; see help {it:{help area_options}}. For example, if you want 645 | the fields to have black outlines, you can type {cmd:p(lcolor(black) lalign(center))}. Unnumbered 646 | option {cmd:p()} affects all color fields. In addition, to address only the fields 647 | corresponding to a specific level of z, you can type {cmd:p}{it:#}{cmd:()} 648 | where {it:#} is the number of the level. For example, if you want the color fields 649 | of the 3rd level to have red outlines, you could type 650 | {cmd:p3(lcolor(red))}. 651 | 652 | {dlgtab:yx_options} 653 | 654 | {marker bins}{...} 655 | {phang} 656 | {opt bins(spec)} or {opt bwidth(spec)}, {opt ybins(spec)} or {opt ybwidth(spec)}, 657 | and {opt xbins(spec)} or {opt xbwidth(spec)} specify how 658 | {it:y} and {it:x} are binned. These options are only allowed with 659 | continuous variables. {cmd:bins()}/{cmd:bwidth()} affects both, 660 | y and x. {cmd:ybins()}/{cmd:ybwidth()} and 661 | {opt xbins()}/{cmd:xbwidth()} only affect y or x, respectively, taking precedence over 662 | {cmd:bins()}/{cmd:bwidth()}. 
{it:spec} is 663 | 664 | [{it:n} {it:lb} {it:ub}] [{cmd:,} {it:suboptions} ] 665 | 666 | {pmore} 667 | for {cmd:bins()} and 668 | 669 | [{it:width} {it:lb} {it:ub}] [{cmd:,} {it:suboptions} ] 670 | 671 | {pmore} 672 | for {cmd:bwidth()} where 673 | 674 | {p2colset 13 23 25 2}{...} 675 | {p2col:{it:n}}number of bins 676 | {p_end} 677 | {p2col:{it:width}}bin width 678 | {p_end} 679 | {p2col:{it:lb}}midpoint (or lower bound) of first bin 680 | {p_end} 681 | {p2col:{it:ub}}midpoint (or upper bound) of last bin 682 | {p_end} 683 | 684 | {pmore} 685 | If neither {it:n} nor {it:width} is specified, the 686 | default is to set {it:n} to trunc(min(sqrt({it:N}), 10*ln({it:N})/ln(10))^(9/10)) 687 | (or a fraction thereof if {it:lb} and {it:ub} define a range that is 688 | smaller than the observed data range), where {it:N} is the number of 689 | observations. If {it:lb} and {it:ub} are 690 | omitted, they are set to the observed minimum and maximum of the data, 691 | respectively. Default values can also be requested 692 | by typing "{cmd:.}" instead of providing a number. For example, you could type 693 | {cmd:bins(. 0 10)} to create bins from 0 to 10 using the default number 694 | of bins. Data outside the bin range 695 | defined by {it:lb} and {it:ub} will not be displayed, but will be taken into 696 | account when computing relative frequencies. {it:suboptions} are: 697 | 698 | {phang2} 699 | {cmd:tight} makes the bins tight. By default {it:lb} and {it:ub} are 700 | interpreted as midpoints of the first and last bins. Specify 701 | {cmd:tight} to treat {it:lb} and {it:ub} as outer bounds of the first 702 | and last bins (in general, all bins are defined using right-open intervals; 703 | however, if {cmd:tight} is specified, observations equal to {it:ub} will 704 | be included in the last bin). 
If {cmd:tight} is specified together with {cmd:hexagon}, 705 | the first and last bins are made as tight as possible given the shape and 706 | arrangement of the hexagons; all data falling into the hexagons will be 707 | taken into account even if lower than {it:lb} or larger than {it:ub} 708 | (unless option {cmd:clip} is specified). 709 | 710 | {phang2} 711 | {cmd:ltight} makes the first bin tight, but leaves the last bin unchanged. 712 | 713 | {phang2} 714 | {cmd:rtight} makes the last bin tight, but leaves the first bin unchanged. 715 | 716 | {pmore} 717 | Note on setting the number of bins or the bin width for {cmd:hexagon} 718 | plots: In a grid of true hexagons, the vertical distance between hexagon 719 | midpoints is smaller than the horizontal distance by a factor of 720 | sqrt(3)/2 (or vice versa if {cmd:hexagon(horizontal)} is specified). If applying graph's 721 | {cmd:aspectratio(1)} option to produce a square plot, you may thus want to 722 | set the number of y-bins to about 2/sqrt(3) times the number of x-bins. Likewise, if 723 | y and x have the same scale, you may want to set the width of y-bins to 724 | sqrt(3)/2 times the width of x-bins. 725 | 726 | {marker bcuts}{...} 727 | {phang} 728 | {opth bcuts(numlist)}, {opth ybcuts(numlist)} and {opth xbcuts(numlist)} 729 | specify how {it:y} and {it:x} are binned. Use these options as an alternative 730 | to [{cmd:y}|{cmd:x}]{cmd:bins()} or [{cmd:y}|{cmd:x}]{cmd:bwidth()}. {cmd:bcuts()} 731 | affects both, y and x; {cmd:ybcuts()} and {opt xbcuts()} only affect y or x, 732 | respectively, taking precedence over {cmd:bcuts()}. {it:numlist} is an (ascending) 733 | list of (at least two) cutpoints defining the bins. The bins are defined 734 | as right-open intervals from one cutpoint to the next (except for the last bin 735 | which is right-closed). 
Data smaller than the first cutpoint or larger than 736 | the last cutpoint will not be displayed, but will be taken into 737 | account when computing relative frequencies. Option {helpb heatplot##clip:clip} 738 | will have no effect for variables binned by {cmd:bcuts()}. Option 739 | {helpb heatplot##hexagon:hexagon} is not allowed with {cmd:bcuts()}. 740 | 741 | {marker discrete}{...} 742 | {phang} 743 | {opt discrete}[{opt (#)}], {opt ydiscrete}[{opt (#)}], and {opt xdiscrete}[{opt (#)}] 744 | specify that the variables are discrete and should not be binned. {cmd:discrete} 745 | affects both, y and x. {cmd:ydiscrete} and {opt xdiscrete} only affect y or x, 746 | respectively. Typically, treating variables as discrete only makes sense if 747 | their values are regularly spaced, as {cmd:heatplot} will print color fields 748 | centered at each observed value (furthermore, although allowed, specifying 749 | {helpb heatplot##hexagon:hexagon} together with {cmd:discrete} does not 750 | lead to useful results in most situations). Optional argument {it:#} 751 | specifies the step width affecting the size of the color fields. The default 752 | step width is 1. Categorical variables specified as {cmd:i.}{it:varname} are always treated as 753 | discrete, but you can still use [{cmd:y}|{cmd:x}]{opt discrete(#)} to affect the size of the color fields 754 | in this case. 755 | 756 | {marker clip}{...} 757 | {phang} 758 | {cmd:clip}, {cmd:rclip}, {cmd:lclip}, {cmd:tclip}, and {cmd:bclip} 759 | cause the color fields to be clipped at the lower and upper bounds of the 760 | data range (or the range selected by {it:lb} and {it:ub} in {cmd:bins()} or {cmd:bwidth()}; data 761 | outside the clipped range will be omitted in this case). {cmd:clip} 762 | causes clipping on all four sides, {cmd:rclip} clips on the right, 763 | {cmd:lclip} clips on the left, {cmd:tclip} clips at the top, and {cmd:bclip} 764 | clips at the bottom (any combination is allowed). 
Option {helpb heatplot##normalize:normalize} 765 | will only take into account the remaining area of a field after clipping. 766 | 767 | {marker fillin}{...} 768 | {phang} 769 | {opt fillin(value [size])} fills in empty combinations of (binned) {it:y} and {it:x} 770 | by setting {it:z} to {it:value}. {it:value} can be {cmd:.} to set {it:z} to missing, 771 | which will be displayed if the {cmd:missing} option has been specified. Optional {it:size} 772 | is a number between 0 and 1 that sets the relative size of the color fields created by 773 | {cmd:fillin()}; this is only relevant if {cmd:size()} or {cmd:sizeprop} 774 | has been specified. {it:size} defaults to 1. {opt fillin()} is only allowed in 775 | syntax 1. See {helpb heatplot##backfill:backfill} for a more efficient (but less 776 | flexible) approach to color empty combinations of {it:y} and {it:x}. 777 | 778 | {dlgtab:matrix_options} 779 | 780 | {marker drop}{...} 781 | {phang} 782 | {opt drop(numlist)} drops cells that have a value equal to one of the 783 | specified numbers. For example, type {cmd:drop(0)} to omit cells that contain 0. 784 | 785 | {marker lower}{...} 786 | {phang} 787 | {opt lower} causes only the lower triangle of the matrix to be displayed. Only one of {cmd:lower} and 788 | {cmd:upper} is allowed. 789 | 790 | {marker upper}{...} 791 | {phang} 792 | {opt upper} causes only the upper triangle of the matrix to be displayed. Only one of {cmd:upper} and 793 | {cmd:lower} is allowed. 794 | 795 | {marker nodiagonal}{...} 796 | {phang} 797 | {opt nodiagonal} omits the diagonal of the matrix. 798 | 799 | {marker equations}{...} 800 | {phang} 801 | {opt equations}[{cmd:(}{it:{help line_options}}{cmd:)}] uses the equation 802 | names of the matrix as axis labels, places ticks between equations, and 803 | draws outlines around diagonal equation areas. This 804 | can be useful, for example, if the matrix contains pairwise distances and the 805 | equations identify clusters. 
Use {it:{help line_options}} to affect the 806 | rendering of the outline. {opt equations()} is only allowed in 807 | syntax 3. 808 | 809 | {dlgtab:graph_options} 810 | 811 | {marker nograph}{...} 812 | {phang} 813 | {opt nograph} omits creating a graph. 814 | 815 | {marker label}{...} 816 | {phang} 817 | [{cmd:no}]{opt label} specifies whether variable and value labels should be used or not. In syntax 1 818 | (plot from variables) the default is to use the variable labels of {it:y} and {it:x} 819 | as axis titles and, for categorical variables, the value labels as tick 820 | labels, if variable and value labels exist. Specify {cmd:nolabel} to 821 | use variable names and values instead. In syntax 3 (plot from Stata matrix) the default is to use 822 | row and column names as tick labels. Specify {cmd:label} to instruct {cmd:heatplot} to 823 | look for corresponding variables in the dataset and use their labels. In syntax 2, {cmd:label} 824 | has no effect. 825 | 826 | {marker addplot}{...} 827 | {phang} 828 | {opt addplot(plot)} provides a way to add other plots to the generated graph; see help {it:{help addplot_option}}. 829 | 830 | {marker addplotnopreserve}{...} 831 | {phang} 832 | {opt addplotnopreserve} drops the original data when drawing the graph even if 833 | {cmd:addplot()} has been specified. By default, {cmd:heatplot} temporarily deletes the 834 | original data to speed up drawing the graph. However, if {cmd:addplot()} is 835 | specified, the original data is preserved because {cmd:addplot()} might 836 | make use of it. If you are certain that {cmd:addplot()} does not make use of the original data 837 | (for example, because it only contains a {helpb twoway_scatteri:scatteri} plot), you can 838 | specify {cmd:addplotnopreserve} to save memory and computing time in large datasets. 
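{pmore}
A minimal sketch of the situation described above, assuming the {cmd:nhanes2} example data from the {help heatplot##examples:Examples} below. The coordinates passed to {cmd:scatteri} are arbitrary illustration values; because {cmd:scatteri} does not read any variables, the original data is not needed while drawing:

        . webuse nhanes2, clear
        . heatplot weight height, addplot(scatteri 75 170) addplotnopreserve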
839 | 840 | {marker by}{...} 841 | {phang} 842 | {opt by(varlist [, byopts])} specifies that the plot should be repeated for each set of values of {it:varlist}; see help 843 | {it:{help by_option}} (but note that suboption {cmd:total} is not supported). {cmd:by()} is only allowed in syntax 1. Computation 844 | of relative frequencies will be across all by-groups. 845 | 846 | {marker twopts}{...} 847 | {phang} 848 | {it:twoway_options} are any other options documented in help {it:{help twoway_options}}. 849 | 850 | {dlgtab:generate_options} 851 | 852 | {marker generate}{...} 853 | {phang} 854 | {opt generate}[{opt (namelist)}] stores the plotted data as new 855 | variables. Depending on context, {cmd:generate()} might need to increase the 856 | number of rows in the dataset to store the variables (by default, five rows 857 | are required per color field, the coordinates of the four corners plus 858 | missing as delimiter; if option {helpb heatplot##scatter:scatter} is specified, 859 | only one row per field is required; if option {helpb heatplot##hexagon:hexagon} 860 | is specified, 7 or 9 rows are required depending on whether 861 | {helpb heatplot##clip:clip} has been specified). The default 862 | variable names are: 863 | 864 | {cmd:_Z} (aggregated) values of z 865 | {cmd:_Zid} categorized z 866 | {cmd:_Y} y midpoints 867 | {cmd:_Yshape} y shape coordinates 868 | {cmd:_X} x midpoints 869 | {cmd:_Xshape} x shape coordinates 870 | {cmd:_Size} field size (if relevant) 871 | {cmd:_Mlab} marker label (if relevant) 872 | 873 | {pmore} 874 | Alternatively, specify {it:{help namelist}} containing a custom list of 875 | variable names. If the list contains fewer elements than the number of 876 | variables to be generated, the above names are used for the remaining variables. 877 | 878 | {pmore} 879 | If you only want to generate the variables without drawing a graph, 880 | apply the {helpb heatplot##nograph:nograph} option. 
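{pmore}
As a sketch, assuming the {cmd:nhanes2} example data from the {help heatplot##examples:Examples} below, the following stores the plotted data without drawing a graph. The names {cmd:hp_z} and {cmd:hp_zid} are hypothetical custom names; they are assumed here to apply to the first two variables in the order listed above, with the default names used for the rest:

        . webuse nhanes2, clear
        . heatplot weight height, generate(hp_z hp_zid) nograph
        . describe hp_z hp_zid _Y _X

{pmore}
Add {helpb heatplot##replace:replace} when rerunning the command so that the variables created by the first run can be overwritten.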
881 | 882 | {marker replace}{...} 883 | {phang} 884 | {opt replace} allows overwriting existing variables. 885 | 886 | {marker nopreserve}{...} 887 | {phang} 888 | {opt nopreserve} instructs {cmd:generate()} to replace the original data by 889 | the new variables. The default is to keep the original data. 890 | 891 | 892 | {marker examples}{...} 893 | {title:Examples} 894 | 895 | {pstd} 896 | Also see {browse "http://ideas.repec.org/p/boc/usug19/24.html":Jann (2019b)}. 897 | 898 | {dlgtab:Histograms} 899 | 900 | {pstd} 901 | Bivariate histogram of weight and height: 902 | 903 | . {stata webuse nhanes2, clear} 904 | . {stata heatplot weight height, ylabel(25(25)175)} 905 | 906 | {pstd} 907 | Using first color for background: 908 | 909 | {p 8 12 2} 910 | . {stata heatplot weight height, ylabel(25(25)175, nogrid) backfill colors(magma, reverse)} 911 | 912 | {pstd} 913 | Using hexagons instead of rectangles: 914 | 915 | {p 8 12 2} 916 | . {stata heatplot weight height, ylabel(25(25)175, nogrid) backfill colors(magma, reverse) hexagon} 917 | 918 | {pstd} 919 | Make size of hexagons proportional to the relative frequency and shift 920 | their midpoints to the empirical centers of the included data: 921 | 922 | {p 8 12 2} 923 | . {stata heatplot weight height, ylabel(25(25)175, nogrid) backfill colors(magma, reverse) hexagon sizeprop recenter} 924 | 925 | {pstd} 926 | Report counts instead of percentages and change labels in legend: 927 | 928 | {p 8 12 2} 929 | . {stata heatplot weight height, ylabel(25(25)175, nogrid) backfill colors(magma, reverse) hexagon sizeprop recenter statistic(count) cuts(1(5)96 100) keylabels(, range(1))} 930 | 931 | {dlgtab:Display color ramp instead of legend} 932 | 933 | {pstd} 934 | By default, a legend produced by {helpb graph}'s legend option is 935 | displayed. 
Alternatively, use the {helpb heatplot##ramp:ramp} option to render the legend as 936 | a color ramp in a separate coordinate system (internally, 937 | {helpb graph combine} will be employed to combine the main plot and the 938 | ramp in a single graph): 939 | 940 | . {stata webuse nhanes2, clear} 941 | . {stata heatplot weight height, ramp} 942 | 943 | {pstd} 944 | Place ramp on right, adjust the space used for the 945 | ramp, specify custom labels: 946 | 947 | {p 8 12 2} 948 | . {stata heatplot weight height, ramp(right space(12) label(0(.1).9))} 949 | 950 | {pstd} 951 | Use text labels and remove title: 952 | 953 | {p 8 12 2} 954 | . {stata heatplot weight height, ramp(right label(@min "low" @max "high") subtitle(""))} 955 | 956 | {pstd} 957 | Assign colors based on a transformed scale and retransform the ramp: 958 | 959 | {p 8 12 2} 960 | . {stata heatplot weight height, stat(count) transform(ln(@)) ramp(transform(exp(@)))} 961 | 962 | {dlgtab:Trivariate distributions} 963 | 964 | {pstd} 965 | The following graph displays the gender distribution (proportion female) by weight and height: 966 | 967 | {p 8 12 2} 968 | . {stata webuse nhanes2, clear} 969 | {p_end} 970 | {p 8 12 2} 971 | . {stata heatplot female weight height, hexagon ylabel(25(25)175) cuts(0(.05)1)} 972 | 973 | {pstd} 974 | The same graph additionally taking into account relative frequencies: 975 | 976 | {p 8 12 2} 977 | . {stata heatplot female weight height, hexagon ylabel(25(25)175) cuts(0(.05)1) sizeprop recenter} 978 | 979 | {pstd} 980 | Distribution of the body mass index by gender and its relation to high blood pressure: 981 | 982 | {p 8 12 2} 983 | . {stata heatplot highbp bmi i.female, xdiscrete(0.9) yline(18.5 25) cuts(0(.05).75) sizeprop recenter colors(inferno) plotregion(color(gs11)) ylabel(, nogrid)} 984 | 985 | {pstd} 986 | Sea surface temperature by longitude, latitude, and date: 987 | 988 | {p 8 12 2} 989 | . {stata sysuse surface, clear} 990 | {p_end} 991 | {p 8 12 2} 992 | . 
{stata heatplot temperature longitude latitude, bwidth(.5) statistic(asis) by(date, legend(off)) ylabel(30(1)38) aspectratio(1)} 993 | 994 | {pmore} 995 | In this data, {cmd:longitude} and {cmd:latitude} are on a regular grid with a step width of half a degree. This is why 996 | we set the bin width to 0.5 using option {cmd:bwidth(.5)}. Furthermore, for each 997 | combination of {cmd:date}, {cmd:longitude}, and {cmd:latitude} there is only a single {cmd:temperature} measurement. This 998 | is why we can add option {cmd:statistic(asis)}. The option is not strictly needed; it just skips unnecessary 999 | computations. 1000 | 1001 | {pstd} 1002 | Same plot using hexagons: 1003 | 1004 | {p 8 12 2} 1005 | . {stata heatplot temperature longitude latitude, hexagon bwidth(.5) clip statistic(asis) by(date, legend(off)) ylabel(30(1)38) aspectratio(1)} 1006 | 1007 | {pmore} 1008 | Option {cmd:clip} has been specified to clip the hexagons at the outer bounds of the data. 1009 | 1010 | {dlgtab:Correlation matrix} 1011 | 1012 | {pstd} 1013 | Correlation matrix including correlation coefficients as marker labels: 1014 | 1015 | {p 8 12 2} 1016 | . {stata sysuse auto, clear} 1017 | {p_end} 1018 | {p 8 12 2} 1019 | . {stata correlate price mpg trunk weight length turn foreign} 1020 | {p_end} 1021 | {p 8 12 2} 1022 | . {stata matrix C = r(C)} 1023 | {p_end} 1024 | {p 8 12 2} 1025 | . {stata heatplot C, values(format(%9.3f)) color(hcl diverging, intensity(.6)) legend(off) aspectratio(1)} 1026 | 1027 | {pstd} 1028 | Display only lower triangle and omit the diagonal: 1029 | 1030 | {p 8 12 2} 1031 | . {stata heatplot C, values(format(%9.3f)) color(hcl diverging, intensity(.6)) legend(off) aspectratio(1) lower nodiagonal} 1032 | 1033 | {pstd} 1034 | An issue with the correlation graph above is that the color gradient is not 1035 | centered at zero.
If using a diverging gradient, we would probably want the 1036 | center of the gradient to denote a correlation of zero, and then 1037 | use symmetric intervals on both sides. Use the {helpb heatplot##cuts:cuts()} 1038 | option to control how the intervals are constructed. Examples: 1039 | 1040 | {p 8 12 2} 1041 | . {stata heatplot C, color(hcl diverging, intensity(.6)) aspectratio(1) cuts(-1.05(.1)1.05)} 1042 | {p_end} 1043 | {p 8 12 2} 1044 | . {stata heatplot C, color(hcl diverging, intensity(.6)) aspectratio(1) cuts(-1(`=2/15')1) keylabels(, interval)} 1045 | {p_end} 1046 | 1047 | {pstd} 1048 | As seen above, option {cmd:values()} can be used to display the values of the correlations 1049 | with the color fields. It is also possible to print alternative information collected from 1050 | a second matrix using suboption {cmd:label()}. In the following example, p-values are printed: 1051 | 1052 | {p 8 12 2} 1053 | . {stata pwcorr price mpg trunk weight length turn foreign, sig} 1054 | {p_end} 1055 | {p 8 12 2} 1056 | . {stata matrix C = r(C)} 1057 | {p_end} 1058 | {p 8 12 2} 1059 | . {stata matrix sig = r(sig)} 1060 | {p_end} 1061 | {p 8 12 2} 1062 | . {stata heatplot C, values(label(sig) format(%9.3f)) color(hcl diverging, intensity(.6)) legend(off) aspectratio(1)} 1063 | 1064 | {pstd} 1065 | Furthermore, suboption {cmd:transform()} can be used to edit the labels. Here is an 1066 | example that marks non-significant correlations: 1067 | 1068 | {p 8 12 2} 1069 | . {stata heatplot C, values(label(sig) transform(cond(@>.05, "n.s.", ""))) color(hcl diverging, intensity(.6)) legend(off) aspectratio(1)} 1070 | 1071 | 1072 | {dlgtab:Dissimilarity matrix with clusters} 1073 | 1074 | {pstd} 1075 | Illustration of the use of the {helpb heatplot##equations:equations()} option: 1076 | 1077 | . {stata sysuse lifeexp, clear} 1078 | . {stata keep if gnppc<.} 1079 | . {stata cluster wards popgrowth lexp gnppc} 1080 | .
{stata cluster generate N = groups(`=_N'), ties(fewer)} 1081 | . {stata cluster generate G = groups(5)} 1082 | . {stata sort G N} 1083 | . {stata matrix dissim D = popgrowth lexp gnppc} 1084 | . {stata `"mata: st_matrixcolstripe("D", strofreal(st_data(., "G N")))"'} 1085 | . {stata `"mata: st_matrixrowstripe("D", strofreal(st_data(., "G N")))"'} 1086 | {p 8 12 2} 1087 | . {stata heatplot D, equations(lcolor(red)) plotregion(margin(zero)) legend(off) aspectratio(1) xscale(alt)} 1088 | 1089 | {dlgtab:Spatial weights matrix} 1090 | 1091 | {pstd} 1092 | Setup: 1093 | 1094 | . {stata "copy http://www.stata-press.com/data/r15/homicide1990.dta ."} 1095 | . {stata "copy http://www.stata-press.com/data/r15/homicide1990_shp.dta ."} 1096 | . {stata use homicide1990} 1097 | . {stata spmatrix create contiguity W} {it:(this may take a while)} 1098 | . {stata spmatrix matafromsp W id = W} 1099 | 1100 | {pstd} 1101 | Heat plot with default settings, ignoring cells (i.e. weights) that are 1102 | equal to zero: 1103 | 1104 | . {stata heatplot mata(W), drop(0) aspectratio(1)} 1105 | 1106 | {pstd} 1107 | Hexagon plot with fine-grained resolution: 1108 | 1109 | . {stata heatplot mata(W), drop(0) aspectratio(1) hexagon bins(100)} 1110 | 1111 | {pstd} 1112 | Plotting each cell individually using the {cmd:discrete} option: 1113 | 1114 | {p 8 12 2} 1115 | . {stata heatplot mata(W), drop(0) aspectratio(1) discrete color(black) p(lalign(center))} 1116 | 1117 | {pmore} 1118 | Since in this matrix all (non-zero) weights have the same value, we only need a single 1119 | color, requested by {cmd:color(black)}. Furthermore, {cmd:p(lalign(center))} has been specified 1120 | to prevent the individual color fields from becoming (almost) invisible. 1121 | 1122 | {pstd} 1123 | A very similar plot can also be produced using the {cmd:scatter} option: 1124 | 1125 | {p 8 12 2} 1126 | .
{stata heatplot mata(W), drop(0) aspectratio(1) discrete color(black) scatter p(ms(p))} 1127 | 1128 | 1129 | {title:Returned results} 1130 | 1131 | {p2colset 5 20 20 2}{...} 1132 | {p2col 5 20 24 2: Scalars}{p_end} 1133 | {p2col : {cmd:r(N)}}number of observations 1134 | {p_end} 1135 | {p2col : {cmd:r(levels)}}number of z levels (colors) 1136 | {p_end} 1137 | {p2col : {cmd:r(y_k)}}number of y bins 1138 | {p_end} 1139 | {p2col : {cmd:r(y_wd)}}y bin width 1140 | {p_end} 1141 | {p2col : {cmd:r(y_lb)}}midpoint (or lower bound) of first y bin 1142 | {p_end} 1143 | {p2col : {cmd:r(y_ub)}}midpoint (or upper bound) of last y bin 1144 | {p_end} 1145 | {p2col : {cmd:r(x_k)}}number of x bins 1146 | {p_end} 1147 | {p2col : {cmd:r(x_wd)}}x bin width 1148 | {p_end} 1149 | {p2col : {cmd:r(x_lb)}}midpoint (or lower bound) of first x bin 1150 | {p_end} 1151 | {p2col : {cmd:r(x_ub)}}midpoint (or upper bound) of last x bin 1152 | {p_end} 1153 | 1154 | {p2col 5 20 24 2: Macros}{p_end} 1155 | {p2col : {cmd:r(ztitle)}}legend title 1156 | {p_end} 1157 | {p2col : {cmd:r(ytitle)}}y-axis title 1158 | {p_end} 1159 | {p2col : {cmd:r(xtitle)}}x-axis title 1160 | {p_end} 1161 | {p2col : {cmd:r(colors)}}list of color codes 1162 | {p_end} 1163 | {p2col : {cmd:r(keylabels)}}legend keys 1164 | {p_end} 1165 | {p2col : {cmd:r(eqcoords)}}coordinates of equation outlines 1166 | {p_end} 1167 | 1168 | {p2col 5 20 24 2: Matrices}{p_end} 1169 | {p2col : {cmd:r(cuts)}}cut points used to categorize z 1170 | {p_end} 1171 | 1172 | 1173 | {title:References} 1174 | 1175 | {phang} 1176 | Caceres Bravo, M. (2018). GTOOLS: Stata module to provide a fast 1177 | implementation of common group commands. Available from 1178 | {browse "http://ideas.repec.org/c/boc/bocode/s458514.html"} (also see 1179 | {browse "http://github.com/mcaceresb/stata-gtools"}). 1180 | {p_end} 1181 | {phang} 1182 | Jann, B. (2018). {browse "https://www.stata-journal.com/article.html?article=gr0075":Color palettes for Stata graphics}. 
The Stata Journal 1183 | 18(4): 765-785. 1184 | {p_end} 1185 | {phang} 1186 | Jann, B. (2019a). ColrSpace: Mata class for color management. Available from 1187 | {browse "http://ideas.repec.org/c/boc/bocode/s458597.html"}. 1188 | {p_end} 1189 | {phang} 1190 | Jann, B. (2019b). Heat (and hexagon) plots in Stata. Presentation at London 1191 | Stata Conference 2019. Available from {browse "http://ideas.repec.org/p/boc/usug19/24.html"}. 1192 | {p_end} 1193 | 1194 | {title:Author} 1195 | 1196 | {pstd} 1197 | Ben Jann, University of Bern, ben.jann@unibe.ch 1198 | 1199 | {pstd} 1200 | Thanks for citing this software as follows: 1201 | 1202 | {pmore} 1203 | Jann, B. (2019). heatplot: Stata module to create heat plots and hexagon plots. Available from 1204 | {browse "http://ideas.repec.org/c/boc/bocode/s458598.html"}. 1205 | 1206 | 1207 | {title:Also see} 1208 | 1209 | {psee} 1210 | Online: help for {helpb hexplot}, {helpb colorpalette}, 1211 | {helpb twoway contour} 1212 | 1213 | -------------------------------------------------------------------------------- /hexplot.ado: -------------------------------------------------------------------------------- 1 | *! 
version 1.0.1 19jul2021 Ben Jann 2 | 3 | program hexplot 4 | version 13 5 | _parse comma spec 0 : 0 6 | syntax [, VERTical HORizontal odd even left right /// 7 | HEXagon HEXagon2(passthru) scatter scatter2(passthru) /// 8 | BCuts(passthru) XBCuts(passthru) YBCuts(passthru) * ] 9 | if `"`hexagon'`hexagon2'"'!="" { 10 | di as err "hexagon() not allowed" 11 | exit 198 12 | } 13 | if `"`scatter'`scatter2'"'!="" { 14 | di as err "scatter() not allowed" 15 | exit 198 16 | } 17 | if `"`bcuts'"'!="" { 18 | di as err "bcuts() not allowed" 19 | exit 198 20 | } 21 | if `"`xbcuts'"'!="" { 22 | di as err "xbcuts() not allowed" 23 | exit 198 24 | } 25 | if `"`ybcuts'"'!="" { 26 | di as err "ybcuts() not allowed" 27 | exit 198 28 | } 29 | heatplot `spec', hexagon /// 30 | hexagon(`vertical' `horizontal' `right' `left' `even' `odd') /// 31 | `options' 32 | end 33 | 34 | -------------------------------------------------------------------------------- /hexplot.sthlp: -------------------------------------------------------------------------------- 1 | {smcl} 2 | {* 20jul2021}{...} 3 | {hi:help hexplot}{right:{browse "http://github.com/benjann/heatplot/"}} 4 | {hline} 5 | 6 | {title:Title} 7 | 8 | {pstd}{hi:hexplot} {hline 2} Command to create hexagon plots 9 | 10 | 11 | {title:Syntax} 12 | 13 | {pstd} 14 | Syntax 1: Hex plot from variables 15 | 16 | {p 8 15 2} 17 | {cmd:hexplot} [{it:z}] {it:y} {it:x} {ifin} {weight} 18 | [{cmd:,} 19 | {help hexplot##opts:{it:options}} 20 | ] 21 | 22 | {pmore} 23 | where {it:z} is a numeric variable (assumed constant if omitted), {it:y} 24 | is a numeric variable or a string variable, and {it:x} is a numeric variable 25 | or a string variable. Categorical {it:y} and {it:x} variables can be specified 26 | as {cmd:i.}{it:varname}. 
27 | 28 | {pstd} 29 | Syntax 2: Hex plot from Mata matrix 30 | 31 | {p 8 15 2} 32 | {cmd:hexplot} {opt m:ata(name)} 33 | [{cmd:,} 34 | {help hexplot##opts:{it:options}} 35 | ] 36 | 37 | {pmore} 38 | where {it:name} is a numeric {help mata:Mata matrix} (contents = {it:z}, row index = {it:y}, 39 | column index = {it:x}). 40 | 41 | {pstd} 42 | Syntax 3: Hex plot from Stata matrix 43 | 44 | {p 8 15 2} 45 | {cmd:hexplot} {it:matname} [{cmd:,} 46 | {help hexplot##opts:{it:options}} 47 | ] 48 | 49 | {pmore} 50 | where {it:matname} is a {help matrix:Stata matrix} 51 | (contents = {it:z}, row names = {it:y}, column names = {it:x}). 52 | 53 | 54 | {marker opts}{...} 55 | {synoptset 22}{...} 56 | {synopthdr:options} 57 | {synoptline} 58 | {synopt :{opt hor:izontal}}arrange hexagons horizontally 59 | {p_end} 60 | {synopt :{opt left}}start with a left shift 61 | {p_end} 62 | {synopt :{opt odd}}use odd number of columns 63 | {p_end} 64 | {synopt :{helpb heatplot##heatopts:{it:heatplot_options}}}Syntax 1, Syntax 2, or 65 | Syntax 3 options of {helpb heatplot} 66 | {p_end} 67 | {synoptline} 68 | 69 | {pstd} 70 | {cmd:fweight}s, {cmd:aweight}s, {cmd:iweight}s, and {cmd:pweight}s are allowed with Syntax 1; see help {help weight}. 71 | 72 | 73 | {title:Description} 74 | 75 | {pstd} 76 | {cmd:hexplot} creates hexagon plots. It is implemented as a wrapper for 77 | {helpb heatplot}. {cmd:hexplot} is equivalent to {cmd:heatplot} with option 78 | {cmd:hexagon}. 79 | 80 | 81 | {title:Options} 82 | 83 | {phang} 84 | {opt horizontal} arranges the hexagons horizontally. The default is to arrange 85 | the hexagons vertically. 86 | 87 | {phang} 88 | {opt left} starts with a left-shifted hexagon row. The default is to start 89 | with a right-shifted row. If {cmd:horizontal} is specified, {cmd:left} 90 | starts with a down-shifted row instead of an up-shifted row. 91 | 92 | {phang} 93 | {opt odd} uses an odd number of hexagon columns. The default is to use 94 | an even number of columns.
That is, by default the bins on the x-axis are 95 | constructed in a way such that each bin contains a double column of hexagons, 96 | yielding an even overall number of columns. Specify {cmd:odd} to construct the 97 | bins in a way such that the last bin only contains a single column. If 98 | {cmd:horizontal} is specified, {cmd:odd} affects the number of 99 | hexagon rows rather than columns. 100 | 101 | {phang} 102 | {it:heatplot_options} are {helpb heatplot} options allowed in Syntax 1, 2, or 103 | 3, respectively. Not allowed are options {cmd:scatter()}, {cmd:hexagon()}, 104 | {cmd:bcuts()}, {cmd:ybcuts()}, and {cmd:xbcuts()}. 105 | 106 | {title:Examples} 107 | 108 | . {stata drawnorm y x, n(10000) corr(1 .5 1) cstorage(lower) clear} 109 | . {stata hexplot y x} 110 | . {stata hexplot y x, horizontal} 111 | . {stata hexplot y x, size recenter} 112 | {p 4 8 2} 113 | . {stata hexplot y x, statistic(count) cuts(@min(5)@max) colors(dimgray black) keylabels(, range(1))} 114 | 115 | . {stata sysuse auto, clear} 116 | {p 4 8 2} 117 | . {stata hexplot price weight mpg, colors(plasma, intensity(.6)) p(lc(black) lalign(center)) legend(off) values(format(%9.0f)) aspectratio(1)} 118 | 119 | 120 | {title:Author} 121 | 122 | {pstd} 123 | Ben Jann, University of Bern, ben.jann@unibe.ch 124 | 125 | {pstd} 126 | Thanks for citing this software as follows: 127 | 128 | {pmore} 129 | Jann, B. (2019). heatplot: Stata module to create heat plots and hexagon plots. Available from 130 | {browse "http://ideas.repec.org/c/boc/bocode/s458595.html"}. 
131 | 132 | 133 | {title:Also see} 134 | 135 | {psee} 136 | Online: help for {helpb heatplot}, {helpb colorpalette}, 137 | {helpb twoway contour} 138 | -------------------------------------------------------------------------------- /stata.toc: -------------------------------------------------------------------------------- 1 | v 3 2 | p heatplot module to create heat plots and hexagon plots 3 | --------------------------------------------------------------------------------