├── .gitignore ├── CRAPL-LICENSE.txt ├── README.txt ├── examples ├── memsim.py ├── parse.perl ├── parse_results.py ├── runfair ├── runsim ├── runstride ├── src ├── Makefile ├── configfile.h ├── main.c ├── memory_controller.c ├── memory_controller.h ├── params.h ├── processor.h ├── scheduler-close.c ├── scheduler-close.h ├── scheduler-fair.c ├── scheduler-fair.h ├── scheduler-fcfs.c ├── scheduler-fcfs.h ├── scheduler-frfcfs.c ├── scheduler-orcw.c ├── scheduler-perf.c ├── scheduler-pwrdn.c ├── scheduler-pwrdn.h ├── scheduler-stride.c ├── scheduler-test.c ├── scheduler.c ├── scheduler.h ├── utils.h └── utlist.h ├── usimm-script.pl └── usimm.pdf /.gitignore: -------------------------------------------------------------------------------- 1 | input 2 | obj 3 | output* 4 | GPATH 5 | GRTAGS 6 | GTAGS 7 | GSYMS 8 | bin 9 | -------------------------------------------------------------------------------- /CRAPL-LICENSE.txt: -------------------------------------------------------------------------------- 1 | 2 | THE CRAPL v0 BETA 1 3 | 4 | 5 | 0. Information about the CRAPL 6 | 7 | If you have questions or concerns about the CRAPL, or you need more 8 | information about this license, please contact: 9 | 10 | Matthew Might 11 | http://matt.might.net/ 12 | 13 | 14 | I. Preamble 15 | 16 | Science thrives on openness. 17 | 18 | In modern science, it is often infeasible to replicate claims without 19 | access to the software underlying those claims. 20 | 21 | Let's all be honest: when scientists write code, aesthetics and 22 | software engineering principles take a back seat to having running, 23 | working code before a deadline. 24 | 25 | So, let's release the ugly. And, let's be proud of that. 26 | 27 | 28 | II. Definitions 29 | 30 | 1. "This License" refers to version 0 beta 0 of the Community 31 | Research and Academic Programming License (the CRAPL). 32 | 33 | 2. 
"The Program" refers to the medley of source code, shell scripts, 34 | executables, objects, libraries and build files supplied to You, 35 | or these files as modified by You. 36 | 37 | [Any appearance of design in the Program is purely coincidental and 38 | should not in any way be mistaken for evidence of thoughtful 39 | software construction.] 40 | 41 | 3. "You" refers to the person or persons brave and daft enough to use 42 | the Program. 43 | 44 | 4. "The Documentation" refers to the Program. 45 | 46 | 5. "The Author" probably refers to the caffeine-addled graduate 47 | student that got the Program to work moments before a submission 48 | deadline. 49 | 50 | 51 | III. Terms 52 | 53 | 1. By reading this sentence, You have agreed to the terms and 54 | conditions of this License. 55 | 56 | 2. If the Program shows any evidence of having been properly tested 57 | or verfied, You will disregard this evidence. 58 | 59 | 3. You agree to hold the Author free from shame, embarrassment or 60 | ridicule for any hacks, kludges or leaps of faith found within the 61 | Program. 62 | 63 | 4. You recognize that any request for support for the Program will be 64 | discarded with extreme prejudice. 65 | 66 | 5. The Author reserves all rights to the Program, except for any 67 | rights granted under any additional licenses attached to the 68 | Program. 69 | 70 | 71 | IV. Permissions 72 | 73 | 1. You are permitted to use the Program to validate published 74 | scientific claims. 75 | 76 | 77 | V. Disclaimer of Warranty 78 | 79 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 80 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 81 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT 82 | WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT 83 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 84 | A PARTICULAR PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND 85 | PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE 86 | DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR 87 | CORRECTION. 88 | 89 | 90 | VI. Limitation of Liability 91 | 92 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 93 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR 94 | CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 95 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES 96 | ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT 97 | NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR 98 | LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM 99 | TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER 100 | PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 101 | 102 | -------------------------------------------------------------------------------- /README.txt: -------------------------------------------------------------------------------- 1 | 2 | ------------------------------------------------------------------------------ 3 | USIMM: the Utah SImulated Memory Module 4 | Version 1.3 5 | 6 | USIMM is distributed under the CRAPL (see CRAPL-LICENSE.txt 7 | in this directory). In spite of the tongue-in-cheek terms of 8 | the license, we will be supporting the USIMM infrastructure. 9 | The src/utlist.h file was obtained from http://uthash.sourceforge.net 10 | and is subject to the terms of the BSD license. 11 | For questions to the USIMM developers, email usimm@cs.utah.edu 12 | For updates or discussions, email usimm-users@cs.utah.edu, 13 | or visit this blog post: http://utaharch.blogspot.com/2012/02/usimm.html 14 | 15 | USIMM was developed by members of the Utah Arch group: 16 | Niladrish Chatterjee, Rajeev Balasubramonian, Manjunath Shevgoor, 17 | Seth H. Pugsley, Aniruddha N. Udipi, Ali Shafiee, Kshitij Sudan, Manu Awasthi. 
18 | 19 | We also received traces and suggestions from other JWAC MSC organizers: 20 | Zeshan Chishti (Intel), Alaa R. Alameldeen (Intel), Eric Rotenberg (NC State) 21 | 22 | Code download: http://www.cs.utah.edu/~rajeev/usimm-v1.3.tar.gz 23 | 24 | USIMM Tech Report: http://www.cs.utah.edu/~rajeev/pubs/usimm.pdf 25 | 26 | The JWAC MSC website: http://www.cs.utah.edu/~rajeev/jwac12/ 27 | ------------------------------------------------------------------------------ 28 | 29 | 30 | GETTING STARTED 31 | --------------- 32 | 33 | If you've reached this far, you've already been able to unzip and 34 | untar the distribution with: 35 | gunzip usimm-v1.3.tar.gz 36 | tar xvf usimm-v1.3.tar 37 | 38 | The root directory has the following directories and files: 39 | src/ : Code source files 40 | bin/ : Houses the usimm executable 41 | obj/ : Houses the intermediate object files for the source files 42 | input/ : Has the (simulated) system configuration files and input traces 43 | output/ : Can store the simulation outputs 44 | runsim : A script to execute a few example simulations 45 | README.txt: this file! 46 | usimm.pdf : The USIMM tech report 47 | CRAPL-LICENSE.txt: The CRAPL license. 48 | 49 | To get started, 50 | cd src/ 51 | make clean 52 | make 53 | 54 | This produces a usimm executable in the bin/ directory. To run the 55 | example simulation script, 56 | cd .. 57 | ./runsim 58 | 59 | The simulation should finish in tens of minutes. Use a truncated version of 60 | the trace files for shorter tests. To examine the simulation outputs, 61 | view output/* 62 | 63 | The input/ directory contains the system and DRAM chip configuration 64 | files that are read by USIMM. Do not rename this directory or 65 | the DRAM chip configuration files. The input/ directory also 66 | contains 13 trace files for 10 different benchmarks. Please see 67 | Appendix C of the USIMM Tech report for details on these benchmarks. 
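A single simulation can also be launched by hand instead of through runsim. The Python sketch below only builds the shell command line. It assumes the argument order and output naming seen in this repository's example run scripts: configuration file first, then one trace per core, and an output file named from two-letter trace abbreviations plus the channel count. The helper name usimm_command and the LABELS table are illustrative and are not part of USIMM.

```python
import shlex

# Abbreviations as they appear in the output names of the example run scripts.
# This table is an illustration; extend it for other traces.
LABELS = {
    "comm1": "c1", "comm2": "c2", "fluid": "fl", "swapt": "sw",
    "face": "fa", "ferret": "fe", "black": "bl", "freq": "fr",
    "stream": "st",
}


def usimm_command(channels, traces, binary="bin/usimm"):
    """Return a shell command that runs one workload (hypothetical helper)."""
    config = "input/%dchannel.cfg" % channels
    inputs = ["input/" + trace for trace in traces]
    # Output name: one label per core, then the channel count.
    out = "output/" + "-".join(LABELS[t] for t in traces) + "-%d" % channels
    args = " ".join(shlex.quote(a) for a in [binary, config] + inputs)
    return "%s > %s" % (args, shlex.quote(out))


if __name__ == "__main__":
    print(usimm_command(1, ["comm1", "comm1"]))
```

Run the printed command from the root directory so that the relative bin/, input/ and output/ paths resolve.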
68 | 69 | CODE ORGANIZATION 70 | ----------------- 71 | 72 | The src/ directory has the following files: 73 | 74 | main.c : Handles the main program loop that retires instructions, 75 | fetches new instructions from the input traces, and calls update_memory(). 76 | 77 | memory_controller.c : Implements update_memory(), a function that checks 78 | DRAM timing parameters to determine which commands can issue in this cycle. 79 | 80 | scheduler.c : Function provided by the user to select a command for each 81 | channel in every memory cycle. The provided default is a simple FCFS 82 | algorithm with periodic write drains. 83 | 84 | scheduler.h : Header file for the user's scheduler function. 85 | 86 | configfile.h : Header file to enable reading input system config files. 87 | 88 | memory_controller.h : Header file to enable DRAM timing management. 89 | 90 | params.h : Header file for all system parameters. 91 | 92 | processor.h : Header file for the ROB structure that controls the processor. 93 | 94 | utils.h : A few utility functions. 95 | 96 | utlist.h : Utility functions to manage linked lists. 97 | 98 | 99 | SAMPLE SCHEDULERS 100 | ----------------- 101 | 102 | The src/ directory also includes the following example simple schedulers: 103 | 104 | scheduler-fcfs.c/h : Basic FCFS, plus a periodic write drain mechanism. 105 | 106 | scheduler-close.c/h : Precharges banks during idle cycles soon after a column rd/wr. 
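To illustrate what "FCFS, plus a periodic write drain" means, here is a toy Python model. It is a sketch under stated assumptions and is not the code in scheduler-fcfs.c. The watermark values, the class name and the queue layout are hypothetical, and the model ignores the DRAM timing checks that memory_controller.c performs before a command can issue.

```python
from collections import deque


class FcfsWriteDrain:
    """Toy FCFS scheduler with a write-drain mode (illustrative only)."""

    def __init__(self, hi_wm=3, lo_wm=1):
        # hi_wm / lo_wm are hypothetical watermarks, not USIMM's values.
        self.reads = deque()
        self.writes = deque()
        self.hi_wm = hi_wm
        self.lo_wm = lo_wm
        self.draining = False

    def enqueue(self, kind, addr):
        """Queue a request; kind is "R" for a read or "W" for a write."""
        queue = self.reads if kind == "R" else self.writes
        queue.append((kind, addr))

    def schedule(self):
        """Pick at most one request for this memory cycle, oldest first."""
        # Start draining when writes pile up; stop once the queue is short.
        if len(self.writes) >= self.hi_wm:
            self.draining = True
        elif len(self.writes) <= self.lo_wm:
            self.draining = False

        if self.draining and self.writes:
            return self.writes.popleft()
        if self.reads:
            return self.reads.popleft()
        if self.writes:
            # No reads are waiting, so use the idle cycle for a write.
            return self.writes.popleft()
        return None
```

Using two watermarks rather than one keeps the scheduler from flipping between reads and writes on every cycle: once a drain starts, it continues until the write queue has shrunk to the low watermark.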
107 | 108 | 109 | -------------------------------------------------------------------------------- /examples: -------------------------------------------------------------------------------- 1 | 2 | bin/usimm input/1channel.cfg input/comm2 > output/c2-1 & 3 | bin/usimm input/1channel.cfg input/comm1 input/comm1 > output/c1-c1-1 & 4 | bin/usimm input/1channel.cfg input/comm1 input/comm1 input/comm2 input/comm2 > output/c1-c1-c2-c2-1 & 5 | bin/usimm input/1channel.cfg input/MT0-canneal input/MT1-canneal input/MT2-canneal input/MT3-canneal > output/MTc-1 & 6 | bin/usimm input/1channel.cfg input/fluid input/swapt input/comm2 input/comm2 > output/fl-sw-c2-c2-1 & 7 | bin/usimm input/1channel.cfg input/face input/face input/ferret input/ferret > output/fa-fa-fe-fe-1 & 8 | bin/usimm input/1channel.cfg input/black input/black input/freq input/freq > output/bl-bl-fr-fr-1 & 9 | bin/usimm input/1channel.cfg input/stream input/stream input/stream input/stream > output/st-st-st-st-1 & 10 | 11 | bin/usimm input/4channel.cfg input/comm2 > output/c2-4 & 12 | bin/usimm input/4channel.cfg input/comm1 input/comm1 > output/c1-c1-4 & 13 | bin/usimm input/4channel.cfg input/comm1 input/comm1 input/comm2 input/comm2 > output/c1-c1-c2-c2-4 & 14 | bin/usimm input/4channel.cfg input/MT0-canneal input/MT1-canneal input/MT2-canneal input/MT3-canneal > output/MTc-4 & 15 | bin/usimm input/4channel.cfg input/fluid input/swapt input/comm2 input/comm2 > output/fl-sw-c2-c2-4 & 16 | bin/usimm input/4channel.cfg input/face input/face input/ferret input/ferret > output/fa-fa-fe-fe-4 & 17 | bin/usimm input/4channel.cfg input/black input/black input/freq input/freq > output/bl-bl-fr-fr-4 & 18 | bin/usimm input/4channel.cfg input/stream input/stream input/stream input/stream > output/st-st-st-st-4 & 19 | bin/usimm input/4channel.cfg input/fluid input/fluid input/swapt input/swapt input/comm2 input/comm2 input/ferret input/ferret > output/fl-fl-sw-sw-c2-c2-fe-fe-4 & 20 | bin/usimm input/4channel.cfg 
input/fluid input/fluid input/swapt input/swapt input/comm2 input/comm2 input/ferret input/ferret input/black input/black input/freq input/freq input/comm1 input/comm1 input/stream input/stream > output/fl-fl-sw-sw-c2-c2-fe-fe-bl-bl-fr-fr-c1-c1-st-st-4 & 21 | 22 | -------------------------------------------------------------------------------- /memsim.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | import os 3 | import shutil 4 | import sys 5 | 6 | SIM_DIR = '/home/pranith/ece8873/assignment3/usimm-v1.3' 7 | RESULTS_DIR = '/home/pranith/ece8873/assignment3/results' 8 | TRACE_DIR = '/home/pranith/ece8873/assignment3/Traces' 9 | TRACE_FILES = {'A': 'Trace_A', 'B': 'Trace_B', 'C': 'Trace_C', 'D': 'Trace_D', 'E': 'Trace_E'} 10 | BENCHMARKS = ['AAAA', 'BBBB', 'CCCC', 'DDDD', 'EEEE', 'ABCD', 'ABCE', 'ABDE', 'ACDE', 'BCDE'] 11 | BIN = 'bin/usimm' 12 | SIM = 'usimm' 13 | CONFIG_FILE = 'input/4channel.cfg' 14 | CONFIG = '4channel.cfg' 15 | RUN_FILE = 'run.py' 16 | 17 | def run_bench(run_name, bench, base_dir): 18 | file_name = os.path.join(base_dir, RUN_FILE) 19 | with open(file_name, 'w') as file: 20 | file.write('#!/usr/bin/python\n\n') 21 | file.write('import os\n') 22 | file.write('import glob\n') 23 | file.write('import sys\n\n') 24 | file.write('ppid = os.getppid()\n') 25 | file.write('test_dir = \'/tmp/memsim_\' + \'%s_\' + str(ppid) + \'/%s\'\n' % (run_name, bench)) 26 | file.write('os.chdir(\'%s\')\n' % (base_dir)) 27 | file.write('os.system(\'uname -a\')\n') 28 | file.write('os.system(\'mkdir -p \%s\' % (test_dir))\n') 29 | file.write('os.system(\'mkdir -p \%s/input\' % (test_dir))\n') 30 | file.write('os.system(\'cp ../%s %%s\' %% (test_dir))\n' % SIM) 31 | file.write('os.system(\'cp ../%s %%s\' %% (test_dir))\n' % CONFIG) 32 | file.write('os.system(\'cp ../*.vi %s/input/\' % (test_dir))\n') 33 | file.write('os.chdir(\'%s\' % (test_dir))\n') 34 | 35 | trace_cmd = '' 36 | results_file = '' 37 | for 
trace_name in bench: 38 | trace_loc = os.path.join(TRACE_DIR, TRACE_FILES[trace_name]) 39 | trace_cmd += trace_loc + ' ' 40 | results_file += 'c' + trace_name + '-' 41 | 42 | results_file += '4_' + run_name 43 | 44 | 45 | file.write('os.system(\'./%s %s %s > %s\')\n' % (SIM, CONFIG, trace_cmd, results_file)) 46 | file.write('os.system(\'mv %s %s\')\n' % (results_file, os.path.join(base_dir, results_file))) 47 | file.write('os.system(\'cp %s %s\')\n' % (os.path.join(base_dir, results_file), os.path.join(SIM_DIR, 'output', results_file))) 48 | 49 | file.write('os.system(\'rm -rf %s\' % (test_dir))\n') 50 | file.close() 51 | 52 | os.system('chmod +x %s' % (file_name)) 53 | 54 | # qsub command 55 | cmd = [] 56 | cmd += ['qsub'] 57 | cmd += ['run.py'] 58 | cmd += ['-V -m n'] 59 | cmd += ['-o', '%s/qsub.stdout' % (base_dir)] 60 | cmd += ['-e', '%s/qsub.stderr' % (base_dir)] 61 | cmd += ['-q', 'pool1'] 62 | cmd += ['-N', '%s_%s' % (run_name, bench)] 63 | cmd += ['-l', 'nodes=1:ppn=1'] 64 | 65 | cwd = os.getcwd() 66 | os.chdir('%s' % (base_dir)) 67 | os.system('/bin/echo \'%s\' > %s/RUN_CMD' % (' '.join(cmd), base_dir)) 68 | os.system('%s | tee %s/JOB_ID' % (' '.join(cmd), base_dir)) 69 | os.chdir(cwd) 70 | 71 | 72 | def memsim(run_name): 73 | cur_dir = os.getcwd() 74 | run_dir = os.path.join(RESULTS_DIR, run_name) 75 | bin_name = BIN + '-' + run_name; 76 | 77 | if os.path.exists(run_dir): 78 | print("Erasing current directory..."); 79 | shutil.rmtree(run_dir) 80 | 81 | os.makedirs(run_dir) 82 | shutil.copyfile(os.path.join(cur_dir, bin_name), os.path.join(run_dir, SIM)) 83 | shutil.copystat(os.path.join(cur_dir, bin_name), os.path.join(run_dir, SIM)) 84 | os.system('cp %s/input/*.vi %s' % (SIM_DIR, run_dir)) 85 | shutil.copyfile(os.path.join(cur_dir, CONFIG_FILE), os.path.join(run_dir, CONFIG)) 86 | 87 | for bench in BENCHMARKS: 88 | bench_dir = os.path.join(run_dir, bench) 89 | os.mkdir(bench_dir) 90 | run_bench(run_name, bench, bench_dir) 91 | 92 | if __name__ == 
"__main__": 93 | run_name = 'base' 94 | if len(sys.argv) == 2: 95 | run_name = sys.argv[1] 96 | memsim(run_name) 97 | -------------------------------------------------------------------------------- /parse.perl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl -w 2 | 3 | #----------------------------------------------------------- 4 | # Description: 5 | # ------------ 6 | # This script parses the files mentioned in the runsim script 7 | # and creates a csv with all the relevant information of all 8 | # the simulations 9 | # 10 | # Inputs: 11 | # ------- 12 | # 1) gen_csv.pl takes the name of the USIMM runscript as the only argument 13 | # 2) The outputs produced by the runscript should be in the output/ 14 | # directory 15 | # 3) The output files must follow the following naming convention 16 | # -...-<>-_ 17 | # 4) Usage: > cd 18 | # > ./gen_csv.pl -runscript 19 | # 20 | # Outputs: 21 | # -------- 22 | # output/stats.csv 23 | # Final Metric numbers are also printed on Stdout 24 | # 25 | # Other Notes: 26 | # ------------ 27 | # 1) Make sure all your single thread bcmks have "$single_thread_time" 28 | # (The times in this gen_csv.pl script represent single thread behavior 29 | # with an FCFS scheduler, and will be the ones used for the MSC) 30 | # 2) $mt includes the programs excluded from the fairness calculations 31 | 32 | #----------------------------------------------------------- 33 | use strict; 34 | 35 | my %single_thread_time; 36 | 37 | # Benchmark Naming convention 38 | # $single_thread_time{} 39 | 40 | $single_thread_time{bl1}=318150748; 41 | $single_thread_time{bo1}=293623201; 42 | $single_thread_time{ca1}=465074385; 43 | $single_thread_time{fa1}=404645160; 44 | $single_thread_time{fe1}=379065129; 45 | $single_thread_time{fr1}=305902869; 46 | $single_thread_time{ra1}=319983309; 47 | $single_thread_time{st1}=320441340; 48 | $single_thread_time{vi1}=325420205; 49 | $single_thread_time{x21}=332000385; 50 | 
$single_thread_time{c11}=372897100; 51 | $single_thread_time{c21}=442948245; 52 | $single_thread_time{fl1}=468052997; 53 | $single_thread_time{sw1}=474243253; 54 | 55 | 56 | $single_thread_time{c31}=354524613; 57 | $single_thread_time{c41}=316208876; 58 | $single_thread_time{c51}=265660108; 59 | $single_thread_time{fd1}=464442704; 60 | $single_thread_time{fp1}=251305652; 61 | $single_thread_time{ge1}=389291761; 62 | $single_thread_time{hm1}=296911657; 63 | $single_thread_time{lb1}=608302501; 64 | $single_thread_time{le1}=395169481; 65 | $single_thread_time{li1}=611499664; 66 | $single_thread_time{mc1}=614688361; 67 | $single_thread_time{mu1}=628855245; 68 | $single_thread_time{s21}=351731072; 69 | $single_thread_time{ti1}=672025103; 70 | 71 | 72 | 73 | $single_thread_time{bl4}=187992840; 74 | $single_thread_time{bo4}=167672553; 75 | $single_thread_time{ca4}=300787617; 76 | $single_thread_time{fa4}=210337888; 77 | $single_thread_time{fe4}=232711401; 78 | $single_thread_time{fr4}=174316754; 79 | $single_thread_time{ra4}=186816805; 80 | $single_thread_time{st4}=188074168; 81 | $single_thread_time{vi4}=192811301; 82 | $single_thread_time{x24}=197657337; 83 | $single_thread_time{c14}=244419708; 84 | $single_thread_time{c24}=303069945; 85 | $single_thread_time{fl4}=275488665; 86 | $single_thread_time{sw4}=276348929; 87 | 88 | 89 | $single_thread_time{c34}=218593453; 90 | $single_thread_time{c44}=190803829; 91 | $single_thread_time{c54}=154061128; 92 | $single_thread_time{fd4}=267090041; 93 | $single_thread_time{fp4}=125928705; 94 | $single_thread_time{ge4}=225046893; 95 | $single_thread_time{hm4}=217980413; 96 | $single_thread_time{lb4}=319465145; 97 | $single_thread_time{le4}=390428605; 98 | $single_thread_time{li4}=364135428; 99 | $single_thread_time{mc4}=383824086; 100 | $single_thread_time{mu4}=443990865; 101 | $single_thread_time{s24}=195015969; 102 | $single_thread_time{ti4}=462633387; 103 | 104 | 105 | 106 | 107 | # Multi threaded benchmarks and single programmed 
workloads are being excluded 108 | # from the fairness calculations. 109 | # Slowdown cannot be calculated for these 110 | my %mt; 111 | $mt{MTc} =1; 112 | $mt{c2} =1; 113 | $mt{MTf} =1; 114 | #$mt{MTf4} =1; 115 | 116 | #----------------------------------------------------------- 117 | #Get Options 118 | use Getopt::Long; 119 | my ($ret, $help); 120 | my $runscript; 121 | 122 | $ret = Getopt::Long::GetOptions ( 123 | 124 | "runscript|runsim:s" => \$runscript, 125 | "help|h:s" => \$help 126 | ) ; 127 | 128 | 129 | 130 | if( !(defined $runscript)) { 131 | print STDERR "Warning: USIMM runscript not specified. Using the default ./runsim\n"; 132 | $runscript= "runsim"; 133 | print STDERR "Usage: $0 -runscript \n\n"; 134 | } 135 | 136 | if (defined $help) { 137 | print STDERR "Usage: $0 -runscript \n\n"; 138 | exit; 139 | } 140 | #----------------------------------------------------------- 141 | # Check if the script is being run in the correct directory 142 | if (! -d "./output") { 143 | die ("ERROR: ./output does not exist.. Exiting"); 144 | } 145 | 146 | # Check if the runscript exists in the location specified by the user 147 | if (! -e "$runscript") { 148 | die ("ERROR: $runscript does not exist..
Exiting"); 149 | } 150 | 151 | #----------------------------------------------------------- 152 | # Parse the runscript to get the names of all the output files 153 | # The runscript is assumed to be a shell script where the output 154 | # is dumped using the ">" operator 155 | # All files generated by the ">" operator in the runscript will be 156 | # assumed to be valid usimm outputs 157 | 158 | open (RUNSIM_FH, "<$runscript") or die "Error: Can't open $runscript\n"; 159 | 160 | # This variable keeps track of the max number of programs in each workload 161 | # This is needed to align the columns in the .csv 162 | my $max_progs=0; 163 | 164 | my $line; 165 | 166 | #This variable will have all the filenames that will be parsed as valid USIMM 167 | #outputs 168 | my @outputs; 169 | 170 | while (<RUNSIM_FH>) { 171 | chomp; 172 | $line=$_; 173 | if ($line=~/>/) { 174 | chomp; 175 | $line=~s/.*>\s*//; 176 | $line=~s/.*output.*\///; 177 | $line=~s/.*\///; 178 | $line=~s/&//; 179 | $line=~s/\s+.*//; 180 | $line=~s/"//g; 181 | push @outputs, $line; 182 | 183 | my @benchmarks= split (/-/,$line); 184 | my $file_len = scalar @benchmarks -1; 185 | $max_progs= $file_len if ($max_progs < $file_len); 186 | } 187 | } 188 | close RUNSIM_FH; 189 | 190 | #----------------------------------------------------------- 191 | 192 | my $progs=0; 193 | my $sum_time=0; 194 | 195 | 196 | #my @outputs = ( "c2-1", "c1-c1-1", "bl-bl-fr-fr-1", "c2-4", "st-st-st-st-1", "c1-c1-4", "fa-fa-fe-fe-1", "c1-c1-c2-c2-1", "bl-bl-fr-fr-4", "st-st-st-st-4", "c1-c1-c2-c2-4", "fa-fa-fe-fe-4", "MTc-1", "MTc-4", "fl-sw-c2-c2-1", "fl-sw-c2-c2-4", "fl-fl-sw-sw-c2-c2-fe-fe-4", "fl-fl-sw-sw-c2-c2-fe-fe-bl-bl-fr-fr-c1-c1-st-st-4" ); 197 | 198 | 199 | # This is the file where the output will be written to in the output directory 200 | my $out_file = "stats.csv"; 201 | 202 | print "INFO: Writing output to output/$out_file\n"; 203 | open (OUT_FH, ">output/$out_file") or die "Error: Can't open output/$out_file\n"; 204 |
205 | print OUT_FH "Workload, Sum of Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,,Max SlowDown, MinSlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown...\n"; 206 | 207 | my $infile; 208 | my %bcmk; 209 | my %time; 210 | my %slowdown; 211 | my %max_slowdown; 212 | my %min_slowdown; 213 | my $total_execution_time=0; 214 | my $pfp_total_execution_time=0; 215 | my %edp; 216 | my $total_edp=0; 217 | my $i; 218 | 219 | # Start reading USIMM output files 220 | foreach $infile (@outputs) { 221 | print "INFO: Openeing file output/$infile "; 222 | open (IN_FH, ") { 235 | chomp; 236 | $line= $_; 237 | 238 | 239 | 240 | if ($line =~/^Done: Core/) { 241 | #time {inFile} {core} 242 | $time{$infile}{$n} = $line; 243 | $time{$infile}{$n} =~s/.* //; 244 | 245 | 246 | $total_execution_time += $time{$infile}{$n}; 247 | $pfp_total_execution_time += $time{$infile}{$n} if (!exists $mt{$file_name}); 248 | 249 | $bcmk{$infile}{$n} = $bcmks[$n]; 250 | 251 | # If filename exists in the $mt hash, then it is multi-threaded and will not have a 252 | # single threaded exec time 253 | if (!exists $mt{$file_name}) { 254 | my $single_thread_key= "$bcmks[$n]"."$current_channel"; 255 | if (exists $single_thread_time{$single_thread_key}) { 256 | $slowdown{$infile}{$n} = $time{$infile}{$n}/$single_thread_time{$single_thread_key}; 257 | } else { 258 | die "ERROR: single_thread_time{$single_thread_key} does not exist n=$n file_name=$file_name ... Exiting"; 259 | } 260 | } else { 261 | # SLowdown is marked negative to mark invalid 262 | # This condition is checked each time $slowdown is used 263 | $slowdown{$infile}{$n} = -1; 264 | } 265 | 266 | # Find max slowdown 267 | if (! 
exists $max_slowdown{$infile}) { 268 | if ($slowdown{$infile}{$n} > 0) { 269 | $max_slowdown{$infile} = $slowdown{$infile}{$n}; 270 | } else { 271 | $max_slowdown{$infile} = 0; 272 | } 273 | } elsif ($max_slowdown{$infile} < $slowdown{$infile}{$n} ){ 274 | $max_slowdown{$infile} = $slowdown{$infile}{$n}; 275 | } 276 | 277 | if (! exists $min_slowdown{$infile}) { 278 | if ($slowdown{$infile}{$n} > 0) { 279 | $min_slowdown{$infile} = $slowdown{$infile}{$n}; 280 | } else { 281 | $min_slowdown{$infile} = 0; 282 | } 283 | } elsif ($min_slowdown{$infile} > $slowdown{$infile}{$n} ){ 284 | $min_slowdown{$infile} = $slowdown{$infile}{$n}; 285 | } 286 | 287 | $n++; 288 | 289 | } elsif ($line =~/Energy Delay product \(EDP\) = (\S+) J.s/) { 290 | $edp{$infile} = $1; 291 | $total_edp += $1; 292 | } 293 | } 294 | 295 | #print " - MAX SLOWDOWN: $max_slowdown{$infile} , MIN SLOWDOWN: $min_slowdown{$infile}\n"; 296 | 297 | close IN_FH; 298 | } 299 | 300 | 301 | 302 | my $n=0; 303 | my $total_max_slowdown=0; 304 | my $l_slowdown=0; 305 | my $avg_max_slowdown; 306 | 307 | my $workload; 308 | my $core; 309 | my $num_non_mt_workload=0; 310 | 311 | 312 | $n=0; 313 | # The completion time of MultiThreaded workloads is not considered while 314 | # calculating the PFP metrics 315 | 316 | foreach $workload (sort keys %slowdown) { 317 | $n++; 318 | my $threads= $workload; 319 | 320 | printf "THREADS= $threads\n"; 321 | 322 | 323 | $threads=~s/-\d+$//; 324 | if ( ! 
exists $mt{$threads}) { 325 | $num_non_mt_workload++ ; 326 | } 327 | $total_max_slowdown+=$max_slowdown{$workload} ; 328 | } 329 | $avg_max_slowdown = $total_max_slowdown/$n; 330 | my $pfp_avg_max_slowdown = $total_max_slowdown/$num_non_mt_workload; 331 | 332 | my $pfp = $pfp_total_execution_time * $pfp_avg_max_slowdown; 333 | 334 | #Start Generating the csv file: 335 | foreach $workload (sort keys %time) { 336 | $progs=0; 337 | $sum_time=0; 338 | print OUT_FH "$workload"; 339 | 340 | foreach $core (sort {$a<=> $b} keys %{$time{$workload}}){ 341 | $sum_time += $time{$workload}{$core}; 342 | } 343 | print OUT_FH ",$sum_time"; 344 | 345 | foreach $core (sort {$a<=> $b} keys %{$time{$workload}}){ 346 | print OUT_FH ",$time{$workload}{$core}"; 347 | $progs++; 348 | } 349 | for ($i=$progs; $i<= $max_progs; $i++) { 350 | print OUT_FH ","; 351 | } 352 | 353 | print OUT_FH ",$max_slowdown{$workload}"; 354 | print OUT_FH ",$min_slowdown{$workload}"; 355 | $progs=0; 356 | foreach $core (sort {$a<=> $b} keys %{$time{$workload}}){ 357 | if ($slowdown{$workload}{$core} > 0) { 358 | print OUT_FH ",$slowdown{$workload}{$core}"; 359 | } else { 360 | print OUT_FH ","; 361 | } 362 | $progs++; 363 | } 364 | for ($i=$progs; $i<= $max_progs; $i++) { 365 | print OUT_FH ","; 366 | } 367 | print OUT_FH "\n"; 368 | 369 | } 370 | 371 | print OUT_FH "\n\nTotal Exection Time, $total_execution_time\n"; 372 | print OUT_FH "PFP Avg Max Slowdown, $pfp_avg_max_slowdown\n"; 373 | print OUT_FH "PFP, $pfp\n\n\n"; 374 | 375 | #Print the EDP stats into the CSV 376 | 377 | print OUT_FH "Work Load, EDP\n"; 378 | foreach $workload (sort keys %edp) { 379 | print OUT_FH "$workload, $edp{$workload}\n"; 380 | } 381 | print OUT_FH "Total, $total_edp\n"; 382 | 383 | print "#----------------------------------------------\n"; 384 | print "Total_execution_time = $total_execution_time\n"; 385 | print "PFP Total_execution_time = $pfp_total_execution_time\n"; 386 | print "PFP = $pfp\n"; 387 | print "PFP Average Max 
Slowdown = $pfp_avg_max_slowdown\n"; 388 | print "Total EDP = $total_edp\n"; 389 | print "#----------------------------------------------\n"; 390 | 391 | 392 | close OUT_FH; 393 | 394 | -------------------------------------------------------------------------------- /parse_results.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/python 2 | 3 | import os 4 | import sys 5 | import re 6 | import csv 7 | 8 | SINGLE_THREAD_TIME = {'A': 424330872, 9 | 'B': 357830245, 10 | 'C': 645730097, 11 | 'D': 362998160, 12 | 'E': 377036457 } 13 | 14 | BENCHMARKS = ['AAAA', 'BBBB', 'CCCC', 'DDDD', 'EEEE', 'ABCD', 'ABCE', 'ABDE', 'ACDE', 'BCDE'] 15 | 16 | 17 | def write_results(data, filename): 18 | edp_data = [['Workload (EDP)']] 19 | tot_num_cycle_data = [['Workload (Total Num Cycles)']] 20 | max_slow_data = [['Workload (Max Slowdown)']] 21 | 22 | for bench in BENCHMARKS: 23 | edp_data.append([bench]) 24 | tot_num_cycle_data.append([bench]) 25 | max_slow_data.append([bench]) 26 | 27 | with open(filename, 'w') as results_file: 28 | results_writer = csv.writer(results_file) 29 | results_writer.writerow(['Scheduler', 'Total Num Cycles', 'Avg. 
Max Slowdown', 'PFP', 'Total EDP']) 30 | for config, config_data in iter(sorted(data.iteritems())): 31 | with open('%s_results.csv' % config, 'w') as config_file: 32 | config_writer = csv.writer(config_file) 33 | config_writer.writerow(['Benchmark', 'Core 0 Cycles', 'Core 1 Cycles', 'Core 2 Cycles', 'Core 3 Cycles', 34 | 'Max Slowdown', 'Core 0 Slowdown', 'Core 1 Slowdown', 'Core 2 Slowdown', 'Core 3 Slowdown']) 35 | i = 0 36 | edp_data[i].append(config) 37 | tot_num_cycle_data[i].append(config) 38 | max_slow_data[i].append(config) 39 | i = 1 40 | for bench in BENCHMARKS: 41 | row = [bench] 42 | bench_data = config_data[bench] 43 | for cycle in bench_data['CYCLES']: 44 | row.append(cycle) 45 | 46 | row.append(bench_data['MAX_SLOWDOWN']) 47 | for slowdown in bench_data['SLOWDOWN']: 48 | row.append(slowdown) 49 | 50 | config_writer.writerow(row) 51 | 52 | edp_data[i].append(bench_data['EDP']) 53 | tot_num_cycle_data[i].append(bench_data['TOTAL_NUM_CYCLES']) 54 | max_slow_data[i].append(bench_data['MAX_SLOWDOWN']) 55 | 56 | i = i + 1 57 | 58 | config_writer.writerow([]) 59 | config_writer.writerow(['TOTAL_NUM_CYCLES', config_data['TOTAL_NUM_CYCLES']]) 60 | config_writer.writerow(['AVG_MAX_SLOWDOWN', config_data['AVG_MAX_SLOWDOWN']]) 61 | config_writer.writerow(['PFP', config_data['PFP']]) 62 | config_writer.writerow([]) 63 | config_writer.writerow(['Benchmark', 'EDP']) 64 | for bench in BENCHMARKS: 65 | row = [bench, config_data[bench]['EDP']] 66 | config_writer.writerow(row) 67 | config_writer.writerow(['Total', config_data['TOTAL_EDP']]) 68 | 69 | results_writer.writerow([config, config_data['TOTAL_NUM_CYCLES'], 70 | config_data['AVG_MAX_SLOWDOWN'], 71 | config_data['PFP'], 72 | config_data['TOTAL_EDP']]) 73 | 74 | with open('edp_results.csv', 'w') as edp_file: 75 | edp_writer = csv.writer(edp_file) 76 | for row in edp_data: 77 | edp_writer.writerow(row) 78 | 79 | with open('max_slow_results.csv', 'w') as max_slow_file: 80 | max_slow_writer = 
csv.writer(max_slow_file) 81 | for row in max_slow_data: 82 | max_slow_writer.writerow(row) 83 | 84 | with open('tot_num_cycle_results.csv', 'w') as tot_num_cycle_file: 85 | tot_cycle_writer = csv.writer(tot_num_cycle_file) 86 | for row in tot_num_cycle_data: 87 | tot_cycle_writer.writerow(row) 88 | 89 | 90 | def parse_file(filename, core_map): 91 | data = None 92 | cycle_regex = re.compile(r'Done: Core (?P<core_id>\d+): Fetched \d+ : Committed \d+ : At time : (?P<cycle>\d+)') 93 | edp_regex = re.compile(r'Energy Delay product \(EDP\) = (?P<edp>\d*\.?\d+) J\.s') 94 | 95 | max_slowdown = 0 96 | total_num_cycles = 0 97 | with open(filename, 'r') as data_file: 98 | data = {'EDP': 0, 'TOTAL_NUM_CYCLES': 0, 'CYCLES': [0]*4, 'MAX_SLOWDOWN': 0, 'SLOWDOWN': [0]*4} 99 | for line in data_file: 100 | match = cycle_regex.match(line) 101 | if match: 102 | core = match.group('core_id') 103 | cycles = int(match.group('cycle')) 104 | slowdown = cycles / float(SINGLE_THREAD_TIME[core_map[core]]) 105 | data['CYCLES'][int(core)] = cycles 106 | data['SLOWDOWN'][int(core)] = slowdown 107 | if max_slowdown < slowdown: 108 | max_slowdown = slowdown 109 | total_num_cycles += cycles 110 | 111 | match = edp_regex.match(line) 112 | if match: 113 | data['EDP'] = float(match.group('edp')) 114 | 115 | data['MAX_SLOWDOWN'] = max_slowdown 116 | data['TOTAL_NUM_CYCLES'] = total_num_cycles 117 | return data 118 | 119 | def parse_config(directory): 120 | config_data = {'BENCHMARKS': {}} 121 | total_num_cycles = 0 122 | avg_max_slowdown = 0 123 | total_edp = 0 124 | for bench in BENCHMARKS: 125 | core_map = {} 126 | core_id = 0 127 | data_file = '' 128 | for letter in bench: 129 | core_map[str(core_id)] = letter 130 | core_id = core_id + 1 131 | data_file += 'c' + letter + '-' 132 | 133 | data_file += '4_' + directory 134 | 135 | bench_data = parse_file(os.path.join(bench, data_file), core_map) 136 | config_data[bench] = bench_data 137 | total_num_cycles += bench_data['TOTAL_NUM_CYCLES'] 138 | avg_max_slowdown +=
bench_data['MAX_SLOWDOWN'] 139 | total_edp += bench_data['EDP'] 140 | 141 | avg_max_slowdown = avg_max_slowdown / len(BENCHMARKS) 142 | pfp = avg_max_slowdown * total_num_cycles 143 | 144 | config_data['TOTAL_NUM_CYCLES'] = total_num_cycles 145 | config_data['AVG_MAX_SLOWDOWN'] = avg_max_slowdown 146 | config_data['PFP'] = pfp 147 | config_data['TOTAL_EDP'] = total_edp 148 | return config_data 149 | 150 | def parse_results(dirs): 151 | cur_dir = os.getcwd() 152 | if len(dirs) == 0: 153 | config_dirs = [x for x in os.listdir(cur_dir) if os.path.isdir(x)] 154 | config_dirs.sort() 155 | else: 156 | config_dirs = dirs 157 | 158 | results = {} 159 | 160 | for configuration in config_dirs: 161 | os.chdir(configuration) 162 | config_data = parse_config(configuration) 163 | results[configuration] = config_data 164 | os.chdir(cur_dir) 165 | 166 | write_results(results, 'results.csv') 167 | 168 | if __name__ == "__main__": 169 | dirs = [] 170 | if len(sys.argv) > 1: 171 | dirs = sys.argv[1:] 172 | parse_results(dirs) 173 | -------------------------------------------------------------------------------- /runfair: -------------------------------------------------------------------------------- 1 | 2 | bin/scheduler-fair input/1channel.cfg input/comm2 > output-fair/c2-1 & 3 | bin/scheduler-fair input/1channel.cfg input/comm1 input/comm1 > output-fair/c1-c1-1 & 4 | bin/scheduler-fair input/1channel.cfg input/comm1 input/comm1 input/comm2 input/comm2 > output-fair/c1-c1-c2-c2-1 & 5 | bin/scheduler-fair input/1channel.cfg input/MT0-canneal input/MT1-canneal input/MT2-canneal input/MT3-canneal > output-fair/MTc-1 & 6 | bin/scheduler-fair input/1channel.cfg input/fluid input/swapt input/comm2 input/comm2 > output-fair/fl-sw-c2-c2-1 & 7 | bin/scheduler-fair input/1channel.cfg input/face input/face input/ferret input/ferret > output-fair/fa-fa-fe-fe-1 & 8 | bin/scheduler-fair input/1channel.cfg input/black input/black input/freq input/freq > output-fair/bl-bl-fr-fr-1 & 9 | 
bin/scheduler-fair input/1channel.cfg input/stream input/stream input/stream input/stream > output-fair/st-st-st-st-1 &

bin/scheduler-fair input/4channel.cfg input/comm2 > output-fair/c2-4 &
bin/scheduler-fair input/4channel.cfg input/comm1 input/comm1 > output-fair/c1-c1-4 &
bin/scheduler-fair input/4channel.cfg input/comm1 input/comm1 input/comm2 input/comm2 > output-fair/c1-c1-c2-c2-4 &
bin/scheduler-fair input/4channel.cfg input/MT0-canneal input/MT1-canneal input/MT2-canneal input/MT3-canneal > output-fair/MTc-4 &
bin/scheduler-fair input/4channel.cfg input/fluid input/swapt input/comm2 input/comm2 > output-fair/fl-sw-c2-c2-4 &
bin/scheduler-fair input/4channel.cfg input/face input/face input/ferret input/ferret > output-fair/fa-fa-fe-fe-4 &
bin/scheduler-fair input/4channel.cfg input/black input/black input/freq input/freq > output-fair/bl-bl-fr-fr-4 &
bin/scheduler-fair input/4channel.cfg input/stream input/stream input/stream input/stream > output-fair/st-st-st-st-4 &
bin/scheduler-fair input/4channel.cfg input/fluid input/fluid input/swapt input/swapt input/comm2 input/comm2 input/ferret input/ferret > output-fair/fl-fl-sw-sw-c2-c2-fe-fe-4 &
bin/scheduler-fair input/4channel.cfg input/fluid input/fluid input/swapt input/swapt input/comm2 input/comm2 input/ferret input/ferret input/black input/black input/freq input/freq input/comm1 input/comm1 input/stream input/stream > output-fair/fl-fl-sw-sw-c2-c2-fe-fe-bl-bl-fr-fr-c1-c1-st-st-4 &


--------------------------------------------------------------------------------
/runsim:
--------------------------------------------------------------------------------

bin/scheduler-close input/1channel.cfg input/comm2 > output/c2-1 &
bin/scheduler-fcfs input/1channel.cfg input/comm1 input/comm1 > output/c1-c1-1 &


--------------------------------------------------------------------------------
/runstride:
--------------------------------------------------------------------------------

bin/scheduler-stride input/1channel.cfg input/comm2 > output-stride/c2-1 &
bin/scheduler-stride input/1channel.cfg input/comm1 input/comm1 > output-stride/c1-c1-1 &
bin/scheduler-stride input/1channel.cfg input/comm1 input/comm1 input/comm2 input/comm2 > output-stride/c1-c1-c2-c2-1 &
bin/scheduler-stride input/1channel.cfg input/MT0-canneal input/MT1-canneal input/MT2-canneal input/MT3-canneal > output-stride/MTc-1 &
bin/scheduler-stride input/1channel.cfg input/fluid input/swapt input/comm2 input/comm2 > output-stride/fl-sw-c2-c2-1 &
bin/scheduler-stride input/1channel.cfg input/face input/face input/ferret input/ferret > output-stride/fa-fa-fe-fe-1 &
bin/scheduler-stride input/1channel.cfg input/black input/black input/freq input/freq > output-stride/bl-bl-fr-fr-1 &
bin/scheduler-stride input/1channel.cfg input/stream input/stream input/stream input/stream > output-stride/st-st-st-st-1 &

bin/scheduler-stride input/4channel.cfg input/comm2 > output-stride/c2-4 &
bin/scheduler-stride input/4channel.cfg input/comm1 input/comm1 > output-stride/c1-c1-4 &
bin/scheduler-stride input/4channel.cfg input/comm1 input/comm1 input/comm2 input/comm2 > output-stride/c1-c1-c2-c2-4 &
bin/scheduler-stride input/4channel.cfg input/MT0-canneal input/MT1-canneal input/MT2-canneal input/MT3-canneal > output-stride/MTc-4 &
bin/scheduler-stride input/4channel.cfg input/fluid input/swapt input/comm2 input/comm2 > output-stride/fl-sw-c2-c2-4 &
bin/scheduler-stride input/4channel.cfg input/face input/face input/ferret input/ferret > output-stride/fa-fa-fe-fe-4 &
bin/scheduler-stride input/4channel.cfg input/black input/black input/freq input/freq > output-stride/bl-bl-fr-fr-4 &
bin/scheduler-stride input/4channel.cfg input/stream input/stream input/stream input/stream > output-stride/st-st-st-st-4 &
bin/scheduler-stride 
input/4channel.cfg input/fluid input/fluid input/swapt input/swapt input/comm2 input/comm2 input/ferret input/ferret > output-stride/fl-fl-sw-sw-c2-c2-fe-fe-4 & 20 | bin/scheduler-stride input/4channel.cfg input/fluid input/fluid input/swapt input/swapt input/comm2 input/comm2 input/ferret input/ferret input/black input/black input/freq input/freq input/comm1 input/comm1 input/stream input/stream > output-stride/fl-fl-sw-sw-c2-c2-fe-fe-bl-bl-fr-fr-c1-c1-st-st-4 & 21 | 22 | -------------------------------------------------------------------------------- /src/Makefile: -------------------------------------------------------------------------------- 1 | SRCS=main.c memory_controller.c 2 | OBJS=$(addprefix $(OUT_DIR)/, $(patsubst %.c, %.o, $(SRCS))) 3 | # TODO : Make this to the prefix of your target files. EX: scheduler 4 | NAME_RULE="scheduler-*.c" 5 | TARGETS=$(shell ls "./$(NAME_RULE)" | awk '{print $$1}') 6 | 7 | # TODO : Choose your directories 8 | SRC_DIR=./ 9 | OUT_DIR=../obj 10 | OUT_BIN_DIR=../bin 11 | ifndef CAPN 12 | CAPN=1 13 | endif 14 | ifndef PWRN 15 | PWRN=1 16 | endif 17 | 18 | CFLAGS=-O3 -std=c99 -Wall 19 | 20 | all : $(TARGETS) 21 | 22 | # General make 23 | $(TARGETS) : $(OBJS) 24 | @echo "building $* ..." 25 | @mkdir -p $(OUT_BIN_DIR) 26 | $(CC) $(CFLAGS) -o $(OUT_BIN_DIR)/$* $(OBJS) $@ 27 | 28 | # Make yours if you define variables which is conflict with others or you want to name by your-self. 29 | scheduler-frfcfs.c : $(OBJS) 30 | @echo "building $* ..." 31 | @mkdir -p $(OUT_BIN_DIR) 32 | $(CC) $(CFLAGS) -DCAPN=$(CAPN) -o $(OUT_BIN_DIR)/$*-$(CAPN) $(OBJS) $@ 33 | 34 | scheduler-pwrdn.c : $(OBJS) 35 | @echo "building $* ..." 
36 | @mkdir -p $(OUT_BIN_DIR) 37 | $(CC) $(CFLAGS) -DPWRN=$(PWRN) -o $(OUT_BIN_DIR)/$*-$(PWRN) $(OBJS) $@ 38 | 39 | $(OUT_DIR)/%.o : %.c 40 | @mkdir -p $(@D) 41 | $(CC) $(CFLAGS) -c $< -o $@ 42 | 43 | clean : 44 | rm -rf $(OUT_DIR) $(OUT_BIN_DIR) 45 | 46 | -------------------------------------------------------------------------------- /src/configfile.h: -------------------------------------------------------------------------------- 1 | 2 | #ifndef __CONFIG_FILE_IN_H__ 3 | #define __CONFIG_FILE_IN_H__ 4 | 5 | #include "params.h" 6 | 7 | #define EOL 10 8 | #define CR 13 9 | #define SPACE 32 10 | #define TAB 9 11 | 12 | typedef enum { 13 | processor_clk_multiplier_token, 14 | robsize_token, 15 | max_retire_token, 16 | max_fetch_token, 17 | pipelinedepth_token, 18 | 19 | num_channels_token, 20 | num_ranks_token, 21 | num_banks_token, 22 | num_rows_token, 23 | num_columns_token, 24 | cache_line_size_token, 25 | address_bits_token, 26 | 27 | dram_clk_frequency_token, 28 | t_rcd_token, 29 | t_rp_token, 30 | t_cas_token, 31 | t_rc_token, 32 | t_ras_token, 33 | t_rrd_token, 34 | t_faw_token, 35 | t_wr_token, 36 | t_wtr_token, 37 | t_rtp_token, 38 | t_ccd_token, 39 | t_rfc_token, 40 | t_refi_token, 41 | t_cwd_token, 42 | t_rtrs_token, 43 | t_pd_min_token, 44 | t_xp_token, 45 | t_xp_dll_token, 46 | t_data_trans_token, 47 | 48 | vdd_token, 49 | idd0_token, 50 | idd2p0_token, 51 | idd2p1_token, 52 | idd2n_token, 53 | idd3p_token, 54 | idd3n_token, 55 | idd4r_token, 56 | idd4w_token, 57 | idd5_token, 58 | 59 | wq_capacity_token, 60 | address_mapping_token, 61 | wq_lookup_latency_token, 62 | 63 | comment_token, 64 | unknown_token 65 | }token_t; 66 | 67 | 68 | token_t tokenize(char * input){ 69 | size_t length; 70 | length = strlen(input); 71 | if(strncmp(input, "//",2) == 0) { 72 | return comment_token; 73 | } else if (strncmp(input, "PROCESSOR_CLK_MULTIPLIER",length) == 0) { 74 | return processor_clk_multiplier_token; 75 | } else if (strncmp(input, "ROBSIZE",length) == 0) { 76 
| return robsize_token; 77 | } else if (strncmp(input, "MAX_RETIRE",length) == 0) { 78 | return max_retire_token; 79 | } else if (strncmp(input, "MAX_FETCH",length) == 0) { 80 | return max_fetch_token; 81 | } else if (strncmp(input, "PIPELINEDEPTH",length) == 0) { 82 | return pipelinedepth_token; 83 | } else if (strncmp(input, "NUM_CHANNELS",length) == 0) { 84 | return num_channels_token; 85 | } else if (strncmp(input, "NUM_RANKS",length) == 0) { 86 | return num_ranks_token; 87 | } else if (strncmp(input, "NUM_BANKS",length) == 0) { 88 | return num_banks_token; 89 | } else if (strncmp(input, "NUM_ROWS",length) == 0) { 90 | return num_rows_token; 91 | } else if (strncmp(input, "NUM_COLUMNS",length) == 0) { 92 | return num_columns_token; 93 | } else if (strncmp(input, "CACHE_LINE_SIZE",length) == 0) { 94 | return cache_line_size_token; 95 | } else if (strncmp(input, "ADDRESS_BITS",length) == 0) { 96 | return address_bits_token; 97 | } else if (strncmp(input, "DRAM_CLK_FREQUENCY",length) == 0) { 98 | return dram_clk_frequency_token; 99 | } else if (strncmp(input, "T_RC",length) == 0) { 100 | return t_rc_token; 101 | } else if (strncmp(input, "T_RP",length) == 0) { 102 | return t_rp_token; 103 | } else if (strncmp(input, "T_CAS",length) == 0) { 104 | return t_cas_token; 105 | } else if (strncmp(input, "T_RCD",length) == 0) { 106 | return t_rcd_token; 107 | } else if (strncmp(input, "T_RAS",length) == 0) { 108 | return t_ras_token; 109 | } else if (strncmp(input, "T_RRD",length) == 0) { 110 | return t_rrd_token; 111 | } else if (strncmp(input, "T_FAW",length) == 0) { 112 | return t_faw_token; 113 | } else if (strncmp(input, "T_WR",length) == 0) { 114 | return t_wr_token; 115 | } else if (strncmp(input, "T_WTR",length) == 0) { 116 | return t_wtr_token; 117 | } else if (strncmp(input, "T_RTP",length) == 0) { 118 | return t_rtp_token; 119 | } else if (strncmp(input, "T_CCD",length) == 0) { 120 | return t_ccd_token; 121 | } else if (strncmp(input, "T_RFC",length) == 0) { 
        return t_rfc_token;
    } else if (strncmp(input, "T_REFI",length) == 0) {
        return t_refi_token;
    } else if (strncmp(input, "T_CWD",length) == 0) {
        return t_cwd_token;
    } else if (strncmp(input, "T_RTRS",length) == 0) {
        return t_rtrs_token;
    } else if (strncmp(input, "T_PD_MIN",length) == 0) {
        return t_pd_min_token;
    } else if (strncmp(input, "T_XP",length) == 0) {
        return t_xp_token;
    } else if (strncmp(input, "T_XP_DLL",length) == 0) {
        return t_xp_dll_token;
    } else if (strncmp(input, "T_DATA_TRANS",length) == 0) {
        return t_data_trans_token;
    } else if (strncmp(input, "VDD",length) == 0) {
        return vdd_token;
    } else if (strncmp(input, "IDD0",length) == 0) {
        return idd0_token;
    } else if (strncmp(input, "IDD2P0",length) == 0) {
        return idd2p0_token;
    } else if (strncmp(input, "IDD2P1",length) == 0) {
        return idd2p1_token;
    } else if (strncmp(input, "IDD2N",length) == 0) {
        return idd2n_token;
    } else if (strncmp(input, "IDD3P",length) == 0) {
        return idd3p_token;
    } else if (strncmp(input, "IDD3N",length) == 0) {
        return idd3n_token;
    } else if (strncmp(input, "IDD4R",length) == 0) {
        return idd4r_token;
    } else if (strncmp(input, "IDD4W",length) == 0) {
        return idd4w_token;
    } else if (strncmp(input, "IDD5",length) == 0) {
        return idd5_token;
    } else if (strncmp(input, "WQ_CAPACITY",length) == 0) {
        return wq_capacity_token;
    } else if (strncmp(input, "ADDRESS_MAPPING",length) == 0) {
        return address_mapping_token;
    } else if (strncmp(input, "WQ_LOOKUP_LATENCY",length) == 0) {
        return wq_lookup_latency_token;
    }

    else {
        printf("PANIC :Unknown token %s\n",input);
        return unknown_token;
    }
}


void read_config_file(FILE * fin)
{
    int c;  /* int, not char: fgetc() returns an int so that EOF can be told apart from data */
    char input_string[256];
    int input_int;
float input_float; 178 | 179 | while ((c = fgetc(fin)) != EOF){ 180 | if((c != EOL) && (c != CR) && (c != SPACE) && (c != TAB)){ 181 | fscanf(fin,"%s",&input_string[1]); 182 | input_string[0] = c; 183 | } else { 184 | fscanf(fin,"%s",&input_string[0]); 185 | } 186 | token_t input_field = tokenize(&input_string[0]); 187 | 188 | switch(input_field) 189 | { 190 | case comment_token: 191 | while (((c = fgetc(fin)) != EOL) && (c != EOF)){ 192 | /*comment, to be ignored */ 193 | } 194 | break; 195 | 196 | case processor_clk_multiplier_token: 197 | fscanf(fin,"%d",&input_int); 198 | PROCESSOR_CLK_MULTIPLIER = input_int; 199 | break; 200 | 201 | case robsize_token: 202 | fscanf(fin,"%d",&input_int); 203 | ROBSIZE = input_int; 204 | break; 205 | 206 | case max_retire_token: 207 | fscanf(fin,"%d",&input_int); 208 | MAX_RETIRE = input_int; 209 | break; 210 | 211 | case max_fetch_token: 212 | fscanf(fin,"%d",&input_int); 213 | MAX_FETCH = input_int; 214 | break; 215 | 216 | case pipelinedepth_token: 217 | fscanf(fin,"%d",&input_int); 218 | PIPELINEDEPTH = input_int; 219 | break; 220 | 221 | 222 | case num_channels_token: 223 | fscanf(fin,"%d",&input_int); 224 | NUM_CHANNELS = input_int; 225 | break; 226 | 227 | 228 | case num_ranks_token: 229 | fscanf(fin,"%d",&input_int); 230 | NUM_RANKS = input_int; 231 | break; 232 | 233 | case num_banks_token: 234 | fscanf(fin,"%d",&input_int); 235 | NUM_BANKS = input_int; 236 | break; 237 | 238 | case num_rows_token: 239 | fscanf(fin,"%d",&input_int); 240 | NUM_ROWS = input_int; 241 | break; 242 | 243 | case num_columns_token: 244 | fscanf(fin,"%d",&input_int); 245 | NUM_COLUMNS = input_int; 246 | break; 247 | 248 | case cache_line_size_token: 249 | fscanf(fin,"%d",&input_int); 250 | CACHE_LINE_SIZE = input_int; 251 | break; 252 | 253 | case address_bits_token: 254 | fscanf(fin,"%d",&input_int); 255 | ADDRESS_BITS = input_int; 256 | break; 257 | 258 | case dram_clk_frequency_token: 259 | fscanf(fin,"%d",&input_int); 260 | 
DRAM_CLK_FREQUENCY = input_int; 261 | break; 262 | 263 | case t_rcd_token: 264 | fscanf(fin,"%d",&input_int); 265 | T_RCD = input_int*PROCESSOR_CLK_MULTIPLIER; 266 | break; 267 | 268 | case t_rp_token: 269 | fscanf(fin,"%d",&input_int); 270 | T_RP = input_int*PROCESSOR_CLK_MULTIPLIER; 271 | break; 272 | 273 | case t_cas_token: 274 | fscanf(fin,"%d",&input_int); 275 | T_CAS = input_int*PROCESSOR_CLK_MULTIPLIER; 276 | break; 277 | 278 | case t_rc_token: 279 | fscanf(fin,"%d",&input_int); 280 | T_RC = input_int*PROCESSOR_CLK_MULTIPLIER; 281 | break; 282 | 283 | case t_ras_token: 284 | fscanf(fin,"%d",&input_int); 285 | T_RAS = input_int*PROCESSOR_CLK_MULTIPLIER; 286 | break; 287 | 288 | case t_rrd_token: 289 | fscanf(fin,"%d",&input_int); 290 | T_RRD = input_int*PROCESSOR_CLK_MULTIPLIER; 291 | break; 292 | 293 | case t_faw_token: 294 | fscanf(fin,"%d",&input_int); 295 | T_FAW = input_int*PROCESSOR_CLK_MULTIPLIER; 296 | break; 297 | 298 | case t_wr_token: 299 | fscanf(fin,"%d",&input_int); 300 | T_WR = input_int*PROCESSOR_CLK_MULTIPLIER; 301 | break; 302 | 303 | case t_wtr_token: 304 | fscanf(fin,"%d",&input_int); 305 | T_WTR = input_int*PROCESSOR_CLK_MULTIPLIER; 306 | break; 307 | 308 | case t_rtp_token: 309 | fscanf(fin,"%d",&input_int); 310 | T_RTP = input_int*PROCESSOR_CLK_MULTIPLIER; 311 | break; 312 | 313 | case t_ccd_token: 314 | fscanf(fin,"%d",&input_int); 315 | T_CCD = input_int*PROCESSOR_CLK_MULTIPLIER; 316 | break; 317 | 318 | case t_rfc_token: 319 | fscanf(fin,"%d",&input_int); 320 | T_RFC = input_int*PROCESSOR_CLK_MULTIPLIER; 321 | break; 322 | 323 | case t_refi_token: 324 | fscanf(fin,"%d",&input_int); 325 | T_REFI = input_int*PROCESSOR_CLK_MULTIPLIER; 326 | break; 327 | 328 | case t_cwd_token: 329 | fscanf(fin,"%d",&input_int); 330 | T_CWD = input_int*PROCESSOR_CLK_MULTIPLIER; 331 | break; 332 | 333 | case t_rtrs_token: 334 | fscanf(fin,"%d",&input_int); 335 | T_RTRS = input_int*PROCESSOR_CLK_MULTIPLIER; 336 | break; 337 | 338 | case t_pd_min_token: 339 
| fscanf(fin,"%d",&input_int); 340 | T_PD_MIN = input_int*PROCESSOR_CLK_MULTIPLIER; 341 | break; 342 | 343 | case t_xp_token: 344 | fscanf(fin,"%d",&input_int); 345 | T_XP = input_int*PROCESSOR_CLK_MULTIPLIER; 346 | break; 347 | 348 | case t_xp_dll_token: 349 | fscanf(fin,"%d",&input_int); 350 | T_XP_DLL = input_int*PROCESSOR_CLK_MULTIPLIER; 351 | break; 352 | 353 | case t_data_trans_token: 354 | fscanf(fin,"%d",&input_int); 355 | T_DATA_TRANS = input_int*PROCESSOR_CLK_MULTIPLIER; 356 | break; 357 | 358 | case vdd_token: 359 | fscanf(fin,"%f",&input_float); 360 | VDD = input_float; 361 | break; 362 | 363 | case idd0_token: 364 | fscanf(fin,"%f",&input_float); 365 | IDD0 = input_float; 366 | break; 367 | 368 | case idd2p0_token: 369 | fscanf(fin,"%f",&input_float); 370 | IDD2P0 = input_float; 371 | break; 372 | 373 | case idd2p1_token: 374 | fscanf(fin,"%f",&input_float); 375 | IDD2P1 = input_float; 376 | break; 377 | 378 | case idd2n_token: 379 | fscanf(fin,"%f",&input_float); 380 | IDD2N = input_float; 381 | break; 382 | 383 | case idd3p_token: 384 | fscanf(fin,"%f",&input_float); 385 | IDD3P = input_float; 386 | break; 387 | 388 | case idd3n_token: 389 | fscanf(fin,"%f",&input_float); 390 | IDD3N = input_float; 391 | break; 392 | 393 | case idd4r_token: 394 | fscanf(fin,"%f",&input_float); 395 | IDD4R = input_float; 396 | break; 397 | 398 | case idd4w_token: 399 | fscanf(fin,"%f",&input_float); 400 | IDD4W = input_float; 401 | break; 402 | 403 | case idd5_token: 404 | fscanf(fin,"%f",&input_float); 405 | IDD5 = input_float; 406 | break; 407 | 408 | case wq_capacity_token: 409 | fscanf(fin,"%d",&input_int); 410 | WQ_CAPACITY = input_int; 411 | break; 412 | 413 | case address_mapping_token: 414 | fscanf(fin,"%d",&input_int); 415 | ADDRESS_MAPPING= input_int; 416 | break; 417 | 418 | case wq_lookup_latency_token: 419 | fscanf(fin,"%d",&input_int); 420 | WQ_LOOKUP_LATENCY = input_int; 421 | break; 422 | 423 | case unknown_token: 424 | default: 425 | printf("PANIC: 
bad token in cfg file\n"); 426 | break; 427 | 428 | } 429 | } 430 | } 431 | 432 | 433 | void print_params() 434 | { 435 | printf("----------------------------------------------------------------------------------------\n"); 436 | printf("------------------------\n"); 437 | printf("- SIMULATOR PARAMETERS -\n"); 438 | printf("------------------------\n"); 439 | printf("\n-------------\n"); 440 | printf("- PROCESSOR -\n"); 441 | printf("-------------\n"); 442 | printf("PROCESSOR_CLK_MULTIPLIER: %6d\n", PROCESSOR_CLK_MULTIPLIER); 443 | printf("ROBSIZE: %6d\n", ROBSIZE); 444 | printf("MAX_FETCH: %6d\n", MAX_FETCH); 445 | printf("MAX_RETIRE: %6d\n", MAX_RETIRE); 446 | printf("PIPELINEDEPTH: %6d\n", PIPELINEDEPTH); 447 | 448 | printf("\n---------------\n"); 449 | printf("- DRAM Config -\n"); 450 | printf("---------------\n"); 451 | printf("NUM_CHANNELS: %6d\n", NUM_CHANNELS); 452 | printf("NUM_RANKS: %6d\n", NUM_RANKS); 453 | printf("NUM_BANKS: %6d\n", NUM_BANKS); 454 | printf("NUM_ROWS: %6d\n", NUM_ROWS); 455 | printf("NUM_COLUMNS: %6d\n", NUM_COLUMNS); 456 | 457 | printf("\n---------------\n"); 458 | printf("- DRAM Timing -\n"); 459 | printf("---------------\n"); 460 | printf("T_RCD: %6d\n", T_RCD); 461 | printf("T_RP: %6d\n", T_RP); 462 | printf("T_CAS: %6d\n", T_CAS); 463 | printf("T_RC: %6d\n", T_RC); 464 | printf("T_RAS: %6d\n", T_RAS); 465 | printf("T_RRD: %6d\n", T_RRD); 466 | printf("T_FAW: %6d\n", T_FAW); 467 | printf("T_WR: %6d\n", T_WR); 468 | printf("T_WTR: %6d\n", T_WTR); 469 | printf("T_RTP: %6d\n", T_RTP); 470 | printf("T_CCD: %6d\n", T_CCD); 471 | printf("T_RFC: %6d\n", T_RFC); 472 | printf("T_REFI: %6d\n", T_REFI); 473 | printf("T_CWD: %6d\n", T_CWD); 474 | printf("T_RTRS: %6d\n", T_RTRS); 475 | printf("T_PD_MIN: %6d\n", T_PD_MIN); 476 | printf("T_XP: %6d\n", T_XP); 477 | printf("T_XP_DLL: %6d\n", T_XP_DLL); 478 | printf("T_DATA_TRANS: %6d\n", T_DATA_TRANS); 479 | 480 | printf("\n---------------------------\n"); 481 | printf("- DRAM Idd Specifications 
-\n");
    printf("---------------------------\n");

    printf("VDD: %05.2f\n", VDD);
    printf("IDD0: %05.2f\n", IDD0);
    printf("IDD2P0: %05.2f\n", IDD2P0);
    printf("IDD2P1: %05.2f\n", IDD2P1);
    printf("IDD2N: %05.2f\n", IDD2N);
    printf("IDD3P: %05.2f\n", IDD3P);
    printf("IDD3N: %05.2f\n", IDD3N);
    printf("IDD4R: %05.2f\n", IDD4R);
    printf("IDD4W: %05.2f\n", IDD4W);
    printf("IDD5: %05.2f\n", IDD5);

    printf("\n-------------------\n");
    printf("- DRAM Controller -\n");
    printf("-------------------\n");
    printf("WQ_CAPACITY: %6d\n", WQ_CAPACITY);
    printf("ADDRESS_MAPPING: %6d\n", ADDRESS_MAPPING);
    printf("WQ_LOOKUP_LATENCY: %6d\n", WQ_LOOKUP_LATENCY);
    printf("\n----------------------------------------------------------------------------------------\n");


}


#endif // __CONFIG_FILE_IN_H__

--------------------------------------------------------------------------------
/src/main.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#include "processor.h"
#include "configfile.h"
#include "memory_controller.h"
#include "scheduler.h"
#include "params.h"

#define MAXTRACELINESIZE 64
long long int BIGNUM = 1000000;


int expt_done = 0;

long long int CYCLE_VAL = 0;

long long int
get_current_cycle ()
{
  return CYCLE_VAL;
}

struct robstructure *ROB;

FILE **tif; /* The handles to the trace input files. */
FILE *config_file;
FILE *vi_file;

int *prefixtable;
// Moved the following to memory_controller.h so that they are visible
// from the scheduler.
35 | //long long int *committed; 36 | //long long int *fetched; 37 | long long int *time_done; 38 | long long int total_time_done; 39 | float core_power = 0; 40 | 41 | int 42 | main (int argc, char *argv[]) 43 | { 44 | 45 | printf ("---------------------------------------------\n"); 46 | printf ("-- USIMM: the Utah SImulated Memory Module --\n"); 47 | printf ("-- Version: 1.3 --\n"); 48 | printf ("---------------------------------------------\n"); 49 | 50 | int numc = 0; 51 | int num_ret = 0; 52 | int num_fetch = 0; 53 | int num_done = 0; 54 | int numch = 0; 55 | int writeqfull = 0; 56 | int fnstart; 57 | int currMTapp; 58 | long long int maxtd; 59 | int maxcr; 60 | int pow_of_2_cores; 61 | char newstr[MAXTRACELINESIZE]; 62 | int *nonmemops; 63 | char *opertype; 64 | long long int *addr; 65 | long long int *instrpc; 66 | int chips_per_rank = -1; 67 | 68 | /* Initialization code. */ 69 | printf ("Initializing.\n"); 70 | 71 | if (argc < 3) 72 | { 73 | printf 74 | ("Need at least one input configuration file and one trace file as argument. Quitting.\n"); 75 | return -3; 76 | } 77 | 78 | config_file = fopen (argv[1], "r"); 79 | if (!config_file) 80 | { 81 | printf ("Missing system configuration file. Quitting. 
\n"); 82 | return -4; 83 | } 84 | 85 | NUMCORES = argc - 2; 86 | 87 | 88 | ROB = 89 | (struct robstructure *) malloc (sizeof (struct robstructure) * NUMCORES); 90 | tif = (FILE **) malloc (sizeof (FILE *) * NUMCORES); 91 | committed = (long long int *) malloc (sizeof (long long int) * NUMCORES); 92 | fetched = (long long int *) malloc (sizeof (long long int) * NUMCORES); 93 | time_done = (long long int *) malloc (sizeof (long long int) * NUMCORES); 94 | nonmemops = (int *) malloc (sizeof (int) * NUMCORES); 95 | opertype = (char *) malloc (sizeof (char) * NUMCORES); 96 | addr = (long long int *) malloc (sizeof (long long int) * NUMCORES); 97 | instrpc = (long long int *) malloc (sizeof (long long int) * NUMCORES); 98 | prefixtable = (int *) malloc (sizeof (int) * NUMCORES); 99 | currMTapp = -1; 100 | for (numc = 0; numc < NUMCORES; numc++) 101 | { 102 | tif[numc] = fopen (argv[numc + 2], "r"); 103 | if (!tif[numc]) 104 | { 105 | printf ("Missing input trace file %d. Quitting. \n", numc); 106 | return -5; 107 | } 108 | 109 | /* The addresses in each trace are given a prefix that equals 110 | their core ID. If the input trace starts with "MT", it is 111 | assumed to be part of a multi-threaded app. The addresses 112 | from this trace file are given a prefix that equals that of 113 | the last seen input trace file that starts with "MT0". For 114 | example, the following is an acceptable set of inputs for 115 | multi-threaded apps CG (4 threads) and LU (2 threads): 116 | usimm 1channel.cfg MT0CG MT1CG MT2CG MT3CG MT0LU MT1LU */ 117 | prefixtable[numc] = numc; 118 | 119 | /* Find the start of the filename. It's after the last "/". */ 120 | for (fnstart = strlen (argv[numc + 2]); fnstart >= 0; fnstart--) 121 | { 122 | if (argv[numc + 2][fnstart] == '/') 123 | { 124 | break; 125 | } 126 | } 127 | fnstart++; /* fnstart is either the letter after the last / or the 0th letter. 
*/ 128 | 129 | if ((strlen (argv[numc + 2]) - fnstart) > 2) 130 | { 131 | if ((argv[numc + 2][fnstart + 0] == 'M') 132 | && (argv[numc + 2][fnstart + 1] == 'T')) 133 | { 134 | if (argv[numc + 2][fnstart + 2] == '0') 135 | { 136 | currMTapp = numc; 137 | } 138 | else 139 | { 140 | if (currMTapp < 0) 141 | { 142 | printf 143 | ("Poor set of input parameters. Input file %s starts with \"MT\", but there is no preceding input file starting with \"MT0\". Quitting.\n", 144 | argv[numc + 2]); 145 | return -6; 146 | } 147 | else 148 | prefixtable[numc] = currMTapp; 149 | } 150 | } 151 | } 152 | printf 153 | ("Core %d: Input trace file %s : Addresses will have prefix %d\n", 154 | numc, argv[numc + 2], prefixtable[numc]); 155 | 156 | committed[numc] = 0; 157 | fetched[numc] = 0; 158 | time_done[numc] = 0; 159 | ROB[numc].head = 0; 160 | ROB[numc].tail = 0; 161 | ROB[numc].inflight = 0; 162 | ROB[numc].tracedone = 0; 163 | } 164 | 165 | read_config_file (config_file); 166 | 167 | 168 | /* Find the appropriate .vi file to read */ 169 | if (NUM_CHANNELS == 1 && NUMCORES == 1) 170 | { 171 | vi_file = fopen ("input/1Gb_x4.vi", "r"); 172 | chips_per_rank = 16; 173 | printf ("Reading vi file: 1Gb_x4.vi\t\n%d Chips per Rank\n", 174 | chips_per_rank); 175 | } 176 | else if (NUM_CHANNELS == 1 && NUMCORES == 2) 177 | { 178 | vi_file = fopen ("input/2Gb_x4.vi", "r"); 179 | chips_per_rank = 16; 180 | printf ("Reading vi file: 2Gb_x4.vi\t\n%d Chips per Rank\n", 181 | chips_per_rank); 182 | } 183 | else if (NUM_CHANNELS == 1 && (NUMCORES > 2) && (NUMCORES <= 4)) 184 | { 185 | vi_file = fopen ("input/4Gb_x4.vi", "r"); 186 | chips_per_rank = 16; 187 | printf ("Reading vi file: 4Gb_x4.vi\t\n%d Chips per Rank\n", 188 | chips_per_rank); 189 | } 190 | else if (NUM_CHANNELS == 4 && NUMCORES == 1) 191 | { 192 | vi_file = fopen ("input/1Gb_x16.vi", "r"); 193 | chips_per_rank = 4; 194 | printf ("Reading vi file: 1Gb_x16.vi\t\n%d Chips per Rank\n", 195 | chips_per_rank); 196 | } 197 | else if 
(NUM_CHANNELS == 4 && NUMCORES == 2)
    {
      vi_file = fopen ("input/1Gb_x8.vi", "r");
      chips_per_rank = 8;
      printf ("Reading vi file: 1Gb_x8.vi\t\n%d Chips per Rank\n",
              chips_per_rank);
    }
  else if (NUM_CHANNELS == 4 && (NUMCORES > 2) && (NUMCORES <= 4))
    {
      vi_file = fopen ("input/2Gb_x8.vi", "r");
      chips_per_rank = 8;
      printf ("Reading vi file: 2Gb_x8.vi\t\n%d Chips per Rank\n",
              chips_per_rank);
    }
  else if (NUM_CHANNELS == 4 && (NUMCORES > 4) && (NUMCORES <= 8))
    {
      vi_file = fopen ("input/4Gb_x8.vi", "r");
      chips_per_rank = 8;
      printf ("Reading vi file: 4Gb_x8.vi\t\n%d Chips per Rank\n",
              chips_per_rank);
    }
  else if (NUM_CHANNELS == 4 && (NUMCORES > 8) && (NUMCORES <= 16))
    {
      vi_file = fopen ("input/4Gb_x4.vi", "r");
      chips_per_rank = 16;
      printf ("Reading vi file: 4Gb_x4.vi\t\n%d Chips per Rank\n",
              chips_per_rank);
    }
  else
    {
      printf ("PANIC:: Channel - Core configuration not supported\n");
      /* assert (-1) is always true and never fires; assert (0) aborts as intended. */
      assert (0);
    }

  if (!vi_file)
    {
      printf ("Missing DRAM chip parameter file. Quitting. \n");
      return -5;
    }



  assert ((log_base2 (NUM_CHANNELS) + log_base2 (NUM_RANKS) +
           log_base2 (NUM_BANKS) + log_base2 (NUM_ROWS) +
           log_base2 (NUM_COLUMNS) + log_base2 (CACHE_LINE_SIZE)) ==
          ADDRESS_BITS);
  /* Increase the address space and rows per bank depending on the number of input traces. 
*/ 244 | ADDRESS_BITS = ADDRESS_BITS + log_base2 (NUMCORES); 245 | if (NUMCORES == 1) 246 | { 247 | pow_of_2_cores = 1; 248 | } 249 | else 250 | { 251 | pow_of_2_cores = 1 << ((int) log_base2 (NUMCORES - 1) + 1); 252 | } 253 | NUM_ROWS = NUM_ROWS * pow_of_2_cores; 254 | 255 | read_config_file (vi_file); 256 | print_params (); 257 | 258 | for (int i = 0; i < NUMCORES; i++) 259 | { 260 | ROB[i].comptime = 261 | (long long int *) malloc (sizeof (long long int) * ROBSIZE); 262 | ROB[i].mem_address = 263 | (long long int *) malloc (sizeof (long long int) * ROBSIZE); 264 | ROB[i].instrpc = 265 | (long long int *) malloc (sizeof (long long int) * ROBSIZE); 266 | ROB[i].optype = (int *) malloc (sizeof (int) * ROBSIZE); 267 | } 268 | init_memory_controller_vars (); 269 | init_scheduler_vars (); 270 | /* Done initializing. */ 271 | 272 | /* Must start by reading one line of each trace file. */ 273 | for (numc = 0; numc < NUMCORES; numc++) 274 | { 275 | if (fgets (newstr, MAXTRACELINESIZE, tif[numc])) 276 | { 277 | if (sscanf (newstr, "%d %c", &nonmemops[numc], &opertype[numc]) > 0) 278 | { 279 | if (opertype[numc] == 'R') 280 | { 281 | if (sscanf 282 | (newstr, "%d %c %Lx %Lx", &nonmemops[numc], 283 | &opertype[numc], &addr[numc], &instrpc[numc]) < 1) 284 | { 285 | printf ("Panic. Poor trace format.\n"); 286 | return -4; 287 | } 288 | } 289 | else 290 | { 291 | if (opertype[numc] == 'W') 292 | { 293 | if (sscanf 294 | (newstr, "%d %c %Lx", &nonmemops[numc], 295 | &opertype[numc], &addr[numc]) < 1) 296 | { 297 | printf ("Panic. Poor trace format.\n"); 298 | return -3; 299 | } 300 | } 301 | else 302 | { 303 | printf ("Panic. Poor trace format.\n"); 304 | return -2; 305 | } 306 | } 307 | } 308 | else 309 | { 310 | printf ("Panic. 
Poor trace format.\n"); 311 | return -1; 312 | } 313 | } 314 | else 315 | { 316 | if (ROB[numc].inflight == 0) 317 | { 318 | num_done++; 319 | if (!time_done[numc]) 320 | time_done[numc] = 1; 321 | } 322 | ROB[numc].tracedone = 1; 323 | } 324 | } 325 | 326 | 327 | printf ("Starting simulation.\n"); 328 | while (!expt_done) 329 | { 330 | 331 | /* For each core, retire instructions if they have finished. */ 332 | for (numc = 0; numc < NUMCORES; numc++) 333 | { 334 | num_ret = 0; 335 | while ((num_ret < MAX_RETIRE) && ROB[numc].inflight) 336 | { 337 | /* Keep retiring until retire width is consumed or ROB is empty. */ 338 | if (ROB[numc].comptime[ROB[numc].head] < CYCLE_VAL) 339 | { 340 | /* Keep retiring instructions if they are done. */ 341 | ROB[numc].head = (ROB[numc].head + 1) % ROBSIZE; 342 | ROB[numc].inflight--; 343 | committed[numc]++; 344 | num_ret++; 345 | } 346 | else /* Instruction not complete. Stop retirement for this core. */ 347 | break; 348 | } /* End of while loop that is retiring instruction for one core. */ 349 | } /* End of for loop that is retiring instructions for all cores. */ 350 | 351 | 352 | if (CYCLE_VAL % PROCESSOR_CLK_MULTIPLIER == 0) 353 | { 354 | /* Execute function to find ready instructions. */ 355 | update_memory (); 356 | 357 | /* Execute user-provided function to select ready instructions for issue. */ 358 | /* Based on this selection, update DRAM data structures and set 359 | instruction completion times. */ 360 | for (int c = 0; c < NUM_CHANNELS; c++) 361 | { 362 | schedule (c); 363 | gather_stats (c); 364 | } 365 | } 366 | 367 | /* For each core, bring in new instructions from the trace file to 368 | fill up the ROB. 
*/ 369 | num_done = 0; 370 | writeqfull = 0; 371 | for (int c = 0; c < NUM_CHANNELS; c++) 372 | { 373 | if (write_queue_length[c] == WQ_CAPACITY) 374 | { 375 | writeqfull = 1; 376 | break; 377 | } 378 | } 379 | 380 | for (numc = 0; numc < NUMCORES; numc++) 381 | { 382 | if (!ROB[numc].tracedone) 383 | { /* Try to fetch if EOF has not been encountered. */ 384 | num_fetch = 0; 385 | while ((num_fetch < MAX_FETCH) 386 | && (ROB[numc].inflight != ROBSIZE) && (!writeqfull)) 387 | { 388 | /* Keep fetching until fetch width or ROB capacity or WriteQ are fully consumed. */ 389 | /* Read the corresponding trace file and populate the tail of the ROB data structure. */ 390 | /* If Memop, then populate read/write queue. Set up completion time. */ 391 | 392 | if (nonmemops[numc]) 393 | { /* Have some non-memory-ops to consume. */ 394 | ROB[numc].optype[ROB[numc].tail] = 'N'; 395 | ROB[numc].comptime[ROB[numc].tail] = 396 | CYCLE_VAL + PIPELINEDEPTH; 397 | nonmemops[numc]--; 398 | ROB[numc].tail = (ROB[numc].tail + 1) % ROBSIZE; 399 | ROB[numc].inflight++; 400 | fetched[numc]++; 401 | num_fetch++; 402 | } 403 | else 404 | { /* Done consuming non-memory-ops. Must now consume the memory rd or wr. */ 405 | if (opertype[numc] == 'R') 406 | { 407 | addr[numc] = addr[numc] + (long long int) ((long long int) prefixtable[numc] << (ADDRESS_BITS - log_base2 (NUMCORES))); // Add MSB bits so each trace accesses a different address space. 
408 | ROB[numc].mem_address[ROB[numc].tail] = addr[numc]; 409 | ROB[numc].optype[ROB[numc].tail] = opertype[numc]; 410 | ROB[numc].comptime[ROB[numc].tail] = 411 | CYCLE_VAL + BIGNUM; 412 | ROB[numc].instrpc[ROB[numc].tail] = instrpc[numc]; 413 | 414 | // Check to see if the read is for buffered data in write queue - 415 | // return constant latency if match in WQ 416 | // add in read queue otherwise 417 | int lat = 418 | read_matches_write_or_read_queue (addr[numc]); 419 | if (lat) 420 | { 421 | ROB[numc].comptime[ROB[numc].tail] = 422 | CYCLE_VAL + lat + PIPELINEDEPTH; 423 | } 424 | else 425 | { 426 | insert_read (addr[numc], CYCLE_VAL, numc, 427 | ROB[numc].tail, instrpc[numc]); 428 | } 429 | } 430 | else 431 | { /* This must be a 'W'. We are confirming that while reading the trace. */ 432 | if (opertype[numc] == 'W') 433 | { 434 | addr[numc] = addr[numc] + (long long int) ((long long int) prefixtable[numc] << (ADDRESS_BITS - log_base2 (NUMCORES))); // Add MSB bits so each trace accesses a different address space. 435 | ROB[numc].mem_address[ROB[numc].tail] = 436 | addr[numc]; 437 | ROB[numc].optype[ROB[numc].tail] = 438 | opertype[numc]; 439 | ROB[numc].comptime[ROB[numc].tail] = 440 | CYCLE_VAL + PIPELINEDEPTH; 441 | /* Also, add this to the write queue. */ 442 | 443 | if (!write_exists_in_write_queue (addr[numc])) 444 | insert_write (addr[numc], CYCLE_VAL, numc, 445 | ROB[numc].tail); 446 | 447 | for (int c = 0; c < NUM_CHANNELS; c++) 448 | { 449 | if (write_queue_length[c] == WQ_CAPACITY) 450 | { 451 | writeqfull = 1; 452 | break; 453 | } 454 | } 455 | } 456 | else 457 | { 458 | printf ("Panic. Poor trace format. \n"); 459 | return -1; 460 | } 461 | } 462 | ROB[numc].tail = (ROB[numc].tail + 1) % ROBSIZE; 463 | ROB[numc].inflight++; 464 | fetched[numc]++; 465 | num_fetch++; 466 | 467 | /* Done consuming one line of the trace file. Read in the next. 
*/ 468 | if (fgets (newstr, MAXTRACELINESIZE, tif[numc])) 469 | { 470 | if (sscanf 471 | (newstr, "%d %c", &nonmemops[numc], 472 | &opertype[numc]) > 0) 473 | { 474 | if (opertype[numc] == 'R') 475 | { 476 | if (sscanf 477 | (newstr, "%d %c %Lx %Lx", 478 | &nonmemops[numc], &opertype[numc], 479 | &addr[numc], &instrpc[numc]) < 1) 480 | { 481 | printf ("Panic. Poor trace format.\n"); 482 | return -4; 483 | } 484 | } 485 | else 486 | { 487 | if (opertype[numc] == 'W') 488 | { 489 | if (sscanf 490 | (newstr, "%d %c %Lx", 491 | &nonmemops[numc], &opertype[numc], 492 | &addr[numc]) < 1) 493 | { 494 | printf 495 | ("Panic. Poor trace format.\n"); 496 | return -3; 497 | } 498 | } 499 | else 500 | { 501 | printf ("Panic. Poor trace format.\n"); 502 | return -2; 503 | } 504 | } 505 | } 506 | else 507 | { 508 | printf ("Panic. Poor trace format.\n"); 509 | return -1; 510 | } 511 | } 512 | else 513 | { 514 | if (ROB[numc].inflight == 0) 515 | { 516 | num_done++; 517 | if (!time_done[numc]) 518 | time_done[numc] = CYCLE_VAL; 519 | } 520 | ROB[numc].tracedone = 1; 521 | break; /* Break out of the while loop fetching instructions. */ 522 | } 523 | 524 | } /* Done consuming the next rd or wr. */ 525 | 526 | } /* One iteration of the fetch while loop done. */ 527 | } /* Closing brace for if(trace not done). */ 528 | else 529 | { /* Input trace is done. Check to see if all inflight instrs have finished. */ 530 | if (ROB[numc].inflight == 0) 531 | { 532 | num_done++; 533 | if (!time_done[numc]) 534 | time_done[numc] = CYCLE_VAL; 535 | } 536 | } 537 | } /* End of for loop that goes through all cores. */ 538 | 539 | 540 | if (num_done == NUMCORES) 541 | { 542 | /* Traces have been consumed and in-flight windows are empty. Must confirm that write queues have been drained. 
*/ 543 | for (numch = 0; numch < NUM_CHANNELS; numch++) 544 | { 545 | if (write_queue_length[numch]) 546 | break; 547 | } 548 | if (numch == NUM_CHANNELS) 549 | expt_done = 1; /* All traces have been consumed and the write queues are drained. */ 550 | } 551 | 552 | /* Printing details for testing. Remove later. */ 553 | //printf("Cycle: %lld\n", CYCLE_VAL); 554 | //for (numc=0; numc < NUMCORES; numc++) { 555 | // printf("C%d: Inf %d : Hd %d : Tl %d : Comp %lld : type %c : addr %x : TD %d\n", numc, ROB[numc].inflight, ROB[numc].head, ROB[numc].tail, ROB[numc].comptime[ROB[numc].head], ROB[numc].optype[ROB[numc].head], ROB[numc].mem_address[ROB[numc].head], ROB[numc].tracedone); 556 | //} 557 | 558 | CYCLE_VAL++; /* Advance the simulation cycle. */ 559 | } 560 | 561 | 562 | /* Code to make sure that the write queue drain time is included in 563 | the execution time of the thread that finishes last. */ 564 | maxtd = time_done[0]; 565 | maxcr = 0; 566 | for (numc = 1; numc < NUMCORES; numc++) 567 | { 568 | if (time_done[numc] > maxtd) 569 | { 570 | maxtd = time_done[numc]; 571 | maxcr = numc; 572 | } 573 | } 574 | time_done[maxcr] = CYCLE_VAL; 575 | 576 | core_power = 0; 577 | for (numc = 0; numc < NUMCORES; numc++) 578 | { 579 | /* A core has peak power of 10 W in a 4-channel config. Peak power is consumed while the thread is running, else the core is perfectly power gated. */ 580 | core_power = 581 | core_power + (10 * ((float) time_done[numc] / (float) CYCLE_VAL)); 582 | } 583 | if (NUM_CHANNELS == 1) 584 | { 585 | /* The core is more energy-efficient in our single-channel configuration. */ 586 | core_power = core_power / 2.0; 587 | } 588 | 589 | 590 | 591 | printf ("Done with loop. 
Printing stats.\n"); 592 | printf ("Cycles %lld\n", CYCLE_VAL); 593 | total_time_done = 0; 594 | for (numc = 0; numc < NUMCORES; numc++) 595 | { 596 | printf 597 | ("Done: Core %d: Fetched %lld : Committed %lld : At time : %lld\n", 598 | numc, fetched[numc], committed[numc], time_done[numc]); 599 | total_time_done += time_done[numc]; 600 | } 601 | printf ("Sum of execution times for all programs: %lld\n", total_time_done); 602 | printf ("Num reads merged: %lld\n", num_read_merge); 603 | printf ("Num writes merged: %lld\n", num_write_merge); 604 | /* Print all other memory system stats. */ 605 | scheduler_stats (); 606 | print_stats (); 607 | 608 | /*Print Cycle Stats */ 609 | for (int c = 0; c < NUM_CHANNELS; c++) 610 | for (int r = 0; r < NUM_RANKS; r++) 611 | calculate_power (c, r, 0, chips_per_rank); 612 | 613 | printf 614 | ("\n#-------------------------------------- Power Stats ----------------------------------------------\n"); 615 | printf 616 | ("Note: 1. termRoth/termWoth is the power dissipated in the ODT resistors when Read/Writes terminate \n"); 617 | printf (" in other ranks on the same channel\n"); 618 | printf 619 | ("#-------------------------------------------------------------------------------------------------\n\n"); 620 | 621 | 622 | /*Print Power Stats */ 623 | float total_system_power = 0; 624 | for (int c = 0; c < NUM_CHANNELS; c++) 625 | for (int r = 0; r < NUM_RANKS; r++) 626 | total_system_power += calculate_power (c, r, 1, chips_per_rank); 627 | 628 | printf 629 | ("\n#-------------------------------------------------------------------------------------------------\n"); 630 | if (NUM_CHANNELS == 4) 631 | { /* Assuming that this is 4channel.cfg */ 632 | printf ("Total memory system power = %f W\n", 633 | total_system_power / 1000); 634 | printf 635 | ("Miscellaneous system power = 40 W # Processor uncore power, disk, I/O, cooling, etc.\n"); 636 | printf 637 | ("Processor core power = %f W # Assuming that each core consumes 10 W when 
running\n", 638 | core_power); 639 | printf ("Total system power = %f W # Sum of the previous three lines\n", 640 | 40 + core_power + total_system_power / 1000); 641 | printf ("Energy Delay product (EDP) = %2.9f J.s\n", 642 | (40 + core_power + 643 | total_system_power / 1000) * (float) ((double) CYCLE_VAL / 644 | (double) 3200000000) * 645 | (float) ((double) CYCLE_VAL / (double) 3200000000)); 646 | } 647 | else 648 | { /* Assuming that this is 1channel.cfg */ 649 | printf ("Total memory system power = %f W\n", 650 | total_system_power / 1000); 651 | printf ("Miscellaneous system power = 10 W # Processor uncore power, disk, I/O, cooling, etc.\n"); /* The total 40 W misc power will be split across 4 channels, only 1 of which is being considered in the 1-channel experiment. */ 652 | printf ("Processor core power = %f W # Assuming that each core consumes 5 W\n", core_power); /* Assuming that the cores are more lightweight. */ 653 | printf ("Total system power = %f W # Sum of the previous three lines\n", 654 | 10 + core_power + total_system_power / 1000); 655 | printf ("Energy Delay product (EDP) = %2.9f J.s\n", 656 | (10 + core_power + 657 | total_system_power / 1000) * (float) ((double) CYCLE_VAL / 658 | (double) 3200000000) * 659 | (float) ((double) CYCLE_VAL / (double) 3200000000)); 660 | } 661 | 662 | return 0; 663 | } 664 | -------------------------------------------------------------------------------- /src/memory_controller.h: -------------------------------------------------------------------------------- 1 | #ifndef __MEMORY_CONTROLLER_H__ 2 | #define __MEMORY_CONTROLLER_H__ 3 | 4 | #define MAX_NUM_CHANNELS 16 5 | #define MAX_NUM_RANKS 16 6 | #define MAX_NUM_BANKS 32 7 | 8 | // Moved here from main.c 9 | long long int *committed; // total committed instructions in each core 10 | long long int *fetched; // total fetched instructions in each core 11 | 12 | 13 | ////////////////////////////////////////////////// 14 | // Memory Controller Data Structures // 15 | 
////////////////////////////////////////////////// 16 | 17 | // DRAM Address Structure 18 | typedef struct draddr 19 | { 20 | long long int actual_address; // physical address being accessed 21 | int channel; // channel id 22 | int rank; // rank id 23 | int bank; // bank id 24 | long long int row; // row/page id 25 | int column; // column id 26 | } dram_address_t; 27 | 28 | // DRAM Commands 29 | typedef enum {ACT_CMD, COL_READ_CMD, PRE_CMD, COL_WRITE_CMD, PWR_DN_SLOW_CMD, PWR_DN_FAST_CMD, PWR_UP_CMD, REF_CMD, NOP} command_t; 30 | 31 | // Request Types 32 | typedef enum {READ, WRITE} optype_t; 33 | 34 | // Single request structure; fields are documented inline 35 | typedef struct req 36 | { 37 | unsigned long long int physical_address; 38 | dram_address_t dram_addr; 39 | long long int arrival_time; 40 | long long int dispatch_time; // when COL_RD or COL_WR is issued for this request 41 | long long int completion_time; // final completion time 42 | long long int latency; // dispatch_time-arrival_time 43 | int thread_id; // core that issued this request 44 | command_t next_command; // what command needs to be issued to make forward progress with this request 45 | int command_issuable; // can this request be issued in the current cycle 46 | optype_t operation_type; // Read/Write 47 | int request_served; // whether this request has had its final command issued 48 | int instruction_id; // 0 to ROBSIZE-1 49 | long long int instruction_pc; // physical address of the instruction that generated this request (valid only for reads) 50 | void * user_ptr; // user-specified data 51 | struct req * next; 52 | } request_t; 53 | 54 | // Bankstates 55 | typedef enum 56 | { 57 | IDLE, PRECHARGING, REFRESHING, ROW_ACTIVE, PRECHARGE_POWER_DOWN_FAST, PRECHARGE_POWER_DOWN_SLOW, ACTIVE_POWER_DOWN 58 | } bankstate_t; 59 | 60 | // Structure to hold the state of a bank 61 | typedef struct bnk 62 | { 63 | bankstate_t state; 64 | long long int active_row; 65 | long long int next_pre; 66 | long long int next_act; 67 | long 
long int next_read; 68 | long long int next_write; 69 | long long int next_powerdown; 70 | long long int next_powerup; 71 | long long int next_refresh; 72 | }bank_t; 73 | 74 | // contains the states of all banks in the system 75 | bank_t dram_state[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 76 | 77 | // command issued this cycle to this channel 78 | int command_issued_current_cycle[MAX_NUM_CHANNELS]; 79 | 80 | // cas command issued this cycle to this channel 81 | int cas_issued_current_cycle[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; // 1/2 for COL_READ/COL_WRITE 82 | 83 | // Per channel read queue 84 | request_t * read_queue_head[MAX_NUM_CHANNELS]; 85 | 86 | // Per channel write queue 87 | request_t * write_queue_head[MAX_NUM_CHANNELS]; 88 | 89 | // issuables_for_different commands 90 | int cmd_precharge_issuable[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 91 | int cmd_all_bank_precharge_issuable[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 92 | int cmd_powerdown_fast_issuable[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 93 | int cmd_powerdown_slow_issuable[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 94 | int cmd_powerup_issuable[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 95 | int cmd_refresh_issuable[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 96 | 97 | 98 | // refresh variables 99 | long long int next_refresh_completion_deadline[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 100 | long long int last_refresh_completion_deadline[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 101 | int forced_refresh_mode_on[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 102 | int refresh_issue_deadline[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 103 | int issued_forced_refresh_commands[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 104 | int num_issued_refreshes[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 105 | 106 | long long int read_queue_length[MAX_NUM_CHANNELS]; 107 | long long int write_queue_length[MAX_NUM_CHANNELS]; 108 | 109 | // Stats 110 | long long int num_read_merge ; 111 | long long int num_write_merge ; 112 | long long int 
stats_reads_merged_per_channel[MAX_NUM_CHANNELS]; 113 | long long int stats_writes_merged_per_channel[MAX_NUM_CHANNELS]; 114 | long long int stats_reads_seen[MAX_NUM_CHANNELS]; 115 | long long int stats_writes_seen[MAX_NUM_CHANNELS]; 116 | long long int stats_reads_completed[MAX_NUM_CHANNELS]; 117 | long long int stats_writes_completed[MAX_NUM_CHANNELS]; 118 | 119 | double stats_average_read_latency[MAX_NUM_CHANNELS]; 120 | double stats_average_read_queue_latency[MAX_NUM_CHANNELS]; 121 | double stats_average_write_latency[MAX_NUM_CHANNELS]; 122 | double stats_average_write_queue_latency[MAX_NUM_CHANNELS]; 123 | 124 | long long int stats_page_hits[MAX_NUM_CHANNELS]; 125 | double stats_read_row_hit_rate[MAX_NUM_CHANNELS]; 126 | 127 | // Time spent in various states 128 | long long int stats_time_spent_in_active_standby[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 129 | long long int stats_time_spent_in_active_power_down[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 130 | long long int stats_time_spent_in_precharge_power_down_fast[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 131 | long long int stats_time_spent_in_precharge_power_down_slow[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 132 | long long int stats_time_spent_in_power_up[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 133 | long long int last_activate[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 134 | long long int last_refresh[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 135 | double average_gap_between_activates[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 136 | double average_gap_between_refreshes[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 137 | long long int stats_time_spent_terminating_reads_from_other_ranks[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 138 | long long int stats_time_spent_terminating_writes_to_other_ranks[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 139 | 140 | // Command Counters 141 | long long int stats_num_activate_read[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 142 | long long int stats_num_activate_write[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 143 | long long int 
stats_num_activate_spec[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 144 | long long int stats_num_activate[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 145 | long long int stats_num_precharge[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 146 | long long int stats_num_read[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 147 | long long int stats_num_write[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 148 | long long int stats_num_powerdown_slow[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 149 | long long int stats_num_powerdown_fast[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 150 | long long int stats_num_powerup[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 151 | 152 | 153 | 154 | // functions 155 | 156 | // to get log with base 2 157 | unsigned int log_base2(unsigned int new_value); 158 | 159 | // initialize memory_controller variables 160 | void init_memory_controller_vars(); 161 | 162 | // called every cycle to update the read/write queues 163 | void update_memory(); 164 | 165 | // activate to bank allowed or not 166 | int is_activate_allowed(int channel, int rank, int bank); 167 | 168 | // precharge to bank allowed or not 169 | int is_precharge_allowed(int channel, int rank, int bank); 170 | 171 | // all bank precharge allowed or not 172 | int is_all_bank_precharge_allowed(int channel, int rank); 173 | 174 | // autoprecharge allowed or not 175 | int is_autoprecharge_allowed(int channel,int rank,int bank); 176 | 177 | // power_down fast allowed or not 178 | int is_powerdown_fast_allowed(int channel,int rank); 179 | 180 | // power_down slow allowed or not 181 | int is_powerdown_slow_allowed(int channel,int rank); 182 | 183 | // powerup allowed or not 184 | int is_powerup_allowed(int channel,int rank); 185 | 186 | // refresh allowed or not 187 | int is_refresh_allowed(int channel,int rank); 188 | 189 | 190 | // issues command to make progress on a request 191 | int issue_request_command(request_t * req); 192 | 193 | // power_down command 194 | int issue_powerdown_command(int channel, int rank, 
command_t cmd); 195 | 196 | // powerup command 197 | int issue_powerup_command(int channel, int rank); 198 | 199 | // activate a row in a bank 200 | int issue_activate_command(int channel, int rank, int bank, long long int row); 201 | 202 | // precharge a bank 203 | int issue_precharge_command(int channel, int rank, int bank); 204 | 205 | // precharge all banks in a rank 206 | int issue_all_bank_precharge_command(int channel, int rank); 207 | 208 | // refresh all banks in a rank 209 | int issue_refresh_command(int channel, int rank); 210 | 211 | // autoprecharge a bank 212 | int issue_autoprecharge(int channel, int rank, int bank); 213 | 214 | // find if there is a matching request in the write or read queue 215 | int read_matches_write_or_read_queue(long long int physical_address); 216 | 217 | // find if there is a matching request in the write queue 218 | int write_exists_in_write_queue(long long int physical_address); 219 | 220 | // enqueue a read into the corresponding read queue (returns ptr to new node) 221 | request_t* insert_read(long long int physical_address, long long int arrival_cycle, int thread_id, int instruction_id, long long int instruction_pc); 222 | 223 | // enqueue a write into the corresponding write queue (returns ptr to new node) 224 | request_t* insert_write(long long int physical_address, long long int arrival_time, int thread_id, int instruction_id); 225 | 226 | // update stats counters 227 | void gather_stats(int channel); 228 | 229 | // print statistics 230 | void print_stats(); 231 | 232 | // calculate power for one rank on a channel 233 | float calculate_power(int channel, int rank, int print_stats_type, int chips_per_rank); 234 | #endif // __MEMORY_CONTROLLER_H__ 235 | -------------------------------------------------------------------------------- /src/params.h: -------------------------------------------------------------------------------- 1 | #ifndef __PARAMS_H__ 2 | #define __PARAMS_H__ 3 | 4 | /********************/ 5 | /* Processor params */ 6 | /********************/ 7 | 
// number of cores in the multicore system 8 | int NUMCORES; 9 | 10 | // processor clock frequency multiplier : multiplying the 11 | // DRAM_CLK_FREQUENCY by the following parameter gives the processor 12 | // clock frequency 13 | int PROCESSOR_CLK_MULTIPLIER; 14 | 15 | // size of ROB 16 | int ROBSIZE ;// 128; 17 | 18 | // maximum commit width 19 | int MAX_RETIRE ;// 2; 20 | 21 | // maximum instruction fetch width 22 | int MAX_FETCH ;// 4; 23 | 24 | // depth of pipeline 25 | int PIPELINEDEPTH ;// 5; 26 | 27 | 28 | /*****************************/ 29 | /* DRAM System Configuration */ 30 | /*****************************/ 31 | // total number of channels in the system 32 | int NUM_CHANNELS ;// 1; 33 | 34 | // number of ranks per channel 35 | int NUM_RANKS ;// 2; 36 | 37 | // number of banks per rank 38 | int NUM_BANKS ;// 8; 39 | 40 | // number of rows per bank 41 | int NUM_ROWS ;// 32768; 42 | 43 | // number of columns per row 44 | int NUM_COLUMNS ;// 128; 45 | 46 | // cache-line size (bytes) 47 | int CACHE_LINE_SIZE ;// 64; 48 | 49 | // total number of address bits (i.e. indicates size of memory) 50 | int ADDRESS_BITS ;// 32; 51 | 52 | /****************************/ 53 | /* DRAM Chip Specifications */ 54 | /****************************/ 55 | 56 | // dram frequency (not datarate) in MHz 57 | int DRAM_CLK_FREQUENCY ;// 800; 58 | 59 | // All the following timing parameters should be 60 | // entered in the config file in terms of memory 61 | // clock cycles. 
62 | 63 | // RAS to CAS delay 64 | int T_RCD ;// 44; 65 | 66 | // PRE to RAS 67 | int T_RP ;// 44; 68 | 69 | // ColumnRD to Data burst 70 | int T_CAS ;// 44; 71 | 72 | // RAS to PRE delay 73 | int T_RAS ;// 112; 74 | 75 | // Row Cycle time 76 | int T_RC ;// 156; 77 | 78 | // ColumnWR to Data burst 79 | int T_CWD ;// 20; 80 | 81 | // write recovery time (COL_WR to PRE) 82 | int T_WR ;// 48; 83 | 84 | // write to read turnaround 85 | int T_WTR ;// 24; 86 | 87 | // rank to rank switching time 88 | int T_RTRS ;// 8; 89 | 90 | // Data transfer 91 | int T_DATA_TRANS ;// 16; 92 | 93 | // Read to PRE 94 | int T_RTP ;// 24; 95 | 96 | // CAS to CAS 97 | int T_CCD ;// 16; 98 | 99 | // Power UP time fast 100 | int T_XP ;// 20; 101 | 102 | // Power UP time slow 103 | int T_XP_DLL ;// 40; 104 | 105 | // Power down entry 106 | int T_CKE ;// 16; 107 | 108 | // Minimum power down duration 109 | int T_PD_MIN ;// 16; 110 | 111 | // row to row activation delay (ACT to ACT within the same rank) 112 | int T_RRD ;// 20; 113 | 114 | // four bank activation window 115 | int T_FAW ;// 128; 116 | 117 | // refresh interval 118 | int T_REFI; 119 | 120 | // refresh cycle time 121 | int T_RFC; 122 | 123 | /****************************/ 124 | /* VOLTAGE & CURRENT VALUES */ 125 | /****************************/ 126 | 127 | float VDD; 128 | 129 | float IDD0; 130 | 131 | float IDD1; 132 | 133 | float IDD2P0; 134 | 135 | float IDD2P1; 136 | 137 | float IDD2N; 138 | 139 | float IDD3P; 140 | 141 | float IDD3N; 142 | 143 | float IDD4R; 144 | 145 | float IDD4W; 146 | 147 | float IDD5; 148 | 149 | /******************************/ 150 | /* MEMORY CONTROLLER Settings */ 151 | /******************************/ 152 | 153 | // maximum capacity of write queue (per channel) 154 | int WQ_CAPACITY ;// 64; 155 | 156 | // ADDRESS_MAPPING mode 157 | // 1 is consecutive cache-lines to same row 158 | // 2 is consecutive cache-lines striped across different banks 159 | int ADDRESS_MAPPING ;// 1; 160 | 161 | // WQ associative lookup latency 162 | 
int WQ_LOOKUP_LATENCY; 163 | 164 | 165 | #endif // __PARAMS_H__ 166 | 167 | -------------------------------------------------------------------------------- /src/processor.h: -------------------------------------------------------------------------------- 1 | #ifndef __PROCESSOR_H__ 2 | #define __PROCESSOR_H__ 3 | 4 | struct robstructure 5 | { 6 | int head; 7 | int tail; 8 | int inflight; 9 | long long int * comptime; 10 | long long int * mem_address; 11 | int * optype; 12 | long long int * instrpc; 13 | int tracedone; 14 | } ; 15 | 16 | #endif //__PROCESSOR_H__ 17 | 18 | -------------------------------------------------------------------------------- /src/scheduler-close.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | #include "params.h" 7 | 8 | /* A basic FCFS policy augmented with a not-so-clever close-page policy. 9 | If the memory controller is unable to issue a command this cycle, find 10 | a bank that recently serviced a column-rd/wr and close it (precharge it). */ 11 | 12 | 13 | extern long long int CYCLE_VAL; 14 | 15 | /* A data structure to see if a bank is a candidate for precharge. */ 16 | int recent_colacc[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 17 | 18 | /* Keeping track of how many preemptive precharges are performed. 
*/ 19 | long long int num_aggr_precharge = 0; 20 | 21 | void 22 | init_scheduler_vars () 23 | { 24 | // initialize all scheduler variables here 25 | int i, j, k; 26 | for (i = 0; i < MAX_NUM_CHANNELS; i++) 27 | { 28 | for (j = 0; j < MAX_NUM_RANKS; j++) 29 | { 30 | for (k = 0; k < MAX_NUM_BANKS; k++) 31 | { 32 | recent_colacc[i][j][k] = 0; 33 | } 34 | } 35 | } 36 | 37 | return; 38 | } 39 | 40 | // write queue high water mark; begin draining writes if write queue exceeds this value 41 | #define HI_WM 40 42 | 43 | // end write queue drain once write queue has this many writes in it 44 | #define LO_WM 20 45 | 46 | // 1 means we are in write-drain mode for that channel 47 | int drain_writes[MAX_NUM_CHANNELS]; 48 | 49 | /* Each cycle it is possible to issue a valid command from the read or write queues 50 | OR 51 | a valid precharge command to any bank (issue_precharge_command()) 52 | OR 53 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 54 | OR 55 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 56 | OR 57 | a refresh command (issue_refresh_command()) 58 | OR 59 | a power_up command (issue_powerup_command()) 60 | OR 61 | an activate to a specific row (issue_activate_command()). 62 | 63 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 64 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 65 | 66 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 
67 | */ 68 | 69 | 70 | void 71 | schedule (int channel) 72 | { 73 | request_t *rd_ptr = NULL; 74 | request_t *wr_ptr = NULL; 75 | int i, j; 76 | 77 | 78 | // if in write drain mode, keep draining writes until the 79 | // write queue occupancy drops to LO_WM 80 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 81 | { 82 | drain_writes[channel] = 1; // Keep draining. 83 | } 84 | else 85 | { 86 | drain_writes[channel] = 0; // No need to drain. 87 | } 88 | 89 | // initiate write drain if either the write queue occupancy 90 | // has reached the HI_WM , OR, if there are no pending read 91 | // requests 92 | if (write_queue_length[channel] > HI_WM) 93 | { 94 | drain_writes[channel] = 1; 95 | } 96 | else 97 | { 98 | if (!read_queue_length[channel]) 99 | drain_writes[channel] = 1; 100 | } 101 | 102 | 103 | // If in write drain mode, look through all the write queue 104 | // elements (already arranged in the order of arrival), and 105 | // issue the command for the first request that is ready 106 | if (drain_writes[channel]) 107 | { 108 | 109 | LL_FOREACH (write_queue_head[channel], wr_ptr) 110 | { 111 | if (wr_ptr->command_issuable) 112 | { 113 | /* Before issuing the command, see if this bank is now a candidate for closure (if it just did a column-rd/wr). 114 | If the bank just did an activate or precharge, it is not a candidate for closure. */ 115 | if (wr_ptr->next_command == COL_WRITE_CMD) 116 | { 117 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr-> 118 | dram_addr. 119 | bank] = 1; 120 | } 121 | if (wr_ptr->next_command == ACT_CMD) 122 | { 123 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr-> 124 | dram_addr. 125 | bank] = 0; 126 | } 127 | if (wr_ptr->next_command == PRE_CMD) 128 | { 129 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr-> 130 | dram_addr. 
131 | bank] = 0; 132 | } 133 | issue_request_command (wr_ptr); 134 | break; 135 | } 136 | } 137 | } 138 | 139 | // Draining Reads 140 | // look through the queue and find the first request whose 141 | // command can be issued in this cycle and issue it 142 | // Simple FCFS 143 | if (!drain_writes[channel]) 144 | { 145 | LL_FOREACH (read_queue_head[channel], rd_ptr) 146 | { 147 | if (rd_ptr->command_issuable) 148 | { 149 | /* Before issuing the command, see if this bank is now a candidate for closure (if it just did a column-rd/wr). 150 | If the bank just did an activate or precharge, it is not a candidate for closure. */ 151 | if (rd_ptr->next_command == COL_READ_CMD) 152 | { 153 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr-> 154 | dram_addr. 155 | bank] = 1; 156 | } 157 | if (rd_ptr->next_command == ACT_CMD) 158 | { 159 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr-> 160 | dram_addr. 161 | bank] = 0; 162 | } 163 | if (rd_ptr->next_command == PRE_CMD) 164 | { 165 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr-> 166 | dram_addr. 167 | bank] = 0; 168 | } 169 | issue_request_command (rd_ptr); 170 | break; 171 | } 172 | } 173 | } 174 | 175 | /* If a command hasn't yet been issued to this channel in this cycle, issue a precharge. */ 176 | if (!command_issued_current_cycle[channel]) 177 | { 178 | for (i = 0; i < NUM_RANKS; i++) 179 | { 180 | for (j = 0; j < NUM_BANKS; j++) 181 | { /* For all banks on the channel.. */ 182 | if (recent_colacc[channel][i][j]) 183 | { /* See if this bank is a candidate. */ 184 | if (is_precharge_allowed (channel, i, j)) 185 | { /* See if precharge is doable. */ 186 | if (issue_precharge_command (channel, i, j)) 187 | { 188 | num_aggr_precharge++; 189 | recent_colacc[channel][i][j] = 0; 190 | } 191 | } 192 | } 193 | } 194 | } 195 | } 196 | 197 | 198 | } 199 | 200 | void 201 | scheduler_stats () 202 | { 203 | /* Nothing to print for now. 
*/ 204 | printf ("Number of aggressive precharges: %lld\n", num_aggr_precharge); 205 | } 206 | -------------------------------------------------------------------------------- /src/scheduler-close.h: -------------------------------------------------------------------------------- 1 | #ifndef __SCHEDULER_H__ 2 | #define __SCHEDULER_H__ 3 | 4 | void init_scheduler_vars(); //called from main 5 | void scheduler_stats(); //called from main 6 | void schedule(int); // scheduler function called every cycle 7 | 8 | #endif //__SCHEDULER_H__ 9 | 10 | -------------------------------------------------------------------------------- /src/scheduler-fair.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | 7 | #define MAX_THREADS 100 8 | #define MAX_CREDITS 1024 9 | 10 | extern long long int CYCLE_VAL; 11 | 12 | // currency used to arbitrate data bus usage between threads 13 | int dbus_credits[MAX_NUM_CHANNELS][MAX_THREADS]; 14 | 15 | // used to make sure each thread gets one dbus_credit per cycle 16 | long long int last_cycle_credited[MAX_NUM_CHANNELS]; 17 | 18 | // fair scheduler stats 19 | long long int count_col_read[MAX_NUM_CHANNELS][MAX_THREADS]; 20 | long long int credits_at_read[MAX_NUM_CHANNELS][MAX_THREADS]; 21 | 22 | void 23 | init_scheduler_vars () 24 | { 25 | // initialize all scheduler variables here 26 | 27 | int i, j; 28 | for (i = 0; i < MAX_NUM_CHANNELS; i++) 29 | { 30 | for (j = 0; j < MAX_THREADS; j++) 31 | { 32 | // all threads start out with maximum credits 33 | dbus_credits[i][j] = MAX_CREDITS; 34 | count_col_read[i][j] = 0; 35 | credits_at_read[i][j] = 0; 36 | } 37 | } 38 | 39 | for (i = 0; i < MAX_NUM_CHANNELS; i++) 40 | { 41 | last_cycle_credited[i] = CYCLE_VAL; 42 | } 43 | 44 | return; 45 | } 46 | 47 | // write queue high water mark; begin draining writes if write queue exceeds this value 48 | #define HI_WM 40 49 | 50 | // 
end write queue drain once write queue has this many writes in it 51 | #define LO_WM 20 52 | 53 | // when switching to write drain mode, write at least this many times before switching back to read mode 54 | #define MIN_WRITES_ONCE_WRITING_HAS_BEGUN 1 55 | 56 | // 1 means we are in write-drain mode for that channel 57 | int drain_writes[MAX_NUM_CHANNELS]; 58 | 59 | // how many writes have been performed since beginning current write drain 60 | int writes_done_this_drain[MAX_NUM_CHANNELS]; 61 | 62 | // flag saying that we're only draining the write queue because there are no reads to schedule 63 | int draining_writes_due_to_rq_empty[MAX_NUM_CHANNELS]; 64 | 65 | /* Each cycle it is possible to issue a valid command from the read or write queues 66 | OR 67 | a valid precharge command to any bank (issue_precharge_command()) 68 | OR 69 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 70 | OR 71 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 72 | OR 73 | a refresh command (issue_refresh_command()) 74 | OR 75 | a power_up command (issue_powerup_command()) 76 | OR 77 | an activate to a specific row (issue_activate_command()). 78 | 79 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 80 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 81 | 82 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 
83 | */ 84 | 85 | 86 | void 87 | schedule (int channel) 88 | { 89 | // increase all threads' credits by one credit per cycle of simulation 90 | long long int since_last_credit = CYCLE_VAL - last_cycle_credited[channel]; 91 | last_cycle_credited[channel] = CYCLE_VAL; 92 | int i; 93 | for (i = 0; i < MAX_THREADS; i++) 94 | { 95 | dbus_credits[channel][i] += (int) since_last_credit; 96 | 97 | // just to make sure that dbus_credits is in the right range even if last_cycle_credit makes it overflow 98 | if ((dbus_credits[channel][i] < 0) 99 | || (dbus_credits[channel][i] > MAX_CREDITS)) 100 | { 101 | dbus_credits[channel][i] = MAX_CREDITS; 102 | } 103 | } 104 | 105 | request_t *rd_ptr = NULL; 106 | request_t *wr_ptr = NULL; 107 | 108 | // begin write drain if we're above the high water mark 109 | if ((write_queue_length[channel] > HI_WM) && (!drain_writes[channel])) 110 | { 111 | drain_writes[channel] = 1; 112 | writes_done_this_drain[channel] = 0; 113 | } 114 | 115 | // also begin write drain if read queue is empty 116 | if ((read_queue_length[channel] < 1) && (write_queue_length[channel] > 0) 117 | && (!drain_writes[channel])) 118 | { 119 | drain_writes[channel] = 1; 120 | writes_done_this_drain[channel] = 0; 121 | draining_writes_due_to_rq_empty[channel] = 1; 122 | } 123 | 124 | // end write drain if we're below the low water mark 125 | if ((drain_writes[channel]) && (write_queue_length[channel] <= LO_WM) 126 | && (!draining_writes_due_to_rq_empty[channel])) 127 | { 128 | drain_writes[channel] = 0; 129 | } 130 | 131 | // end write drain that was due to read_queue emptiness only if at least one write has completed 132 | if ((drain_writes[channel]) && (read_queue_length[channel] > 0) 133 | && (draining_writes_due_to_rq_empty[channel]) 134 | && (writes_done_this_drain[channel] > 135 | MIN_WRITES_ONCE_WRITING_HAS_BEGUN)) 136 | { 137 | drain_writes[channel] = 0; 138 | draining_writes_due_to_rq_empty[channel] = 0; 139 | } 140 | 141 | // make sure we don't try to drain 
writes if there aren't any 142 | if (write_queue_length[channel] == 0) 143 | { 144 | drain_writes[channel] = 0; 145 | } 146 | 147 | // drain from write queue now 148 | if (drain_writes[channel]) 149 | { 150 | // prioritize open row hits 151 | LL_FOREACH (write_queue_head[channel], wr_ptr) 152 | { 153 | // if COL_WRITE_CMD is the next command, then that means the appropriate row must already be open 154 | if (wr_ptr->command_issuable 155 | && (wr_ptr->next_command == COL_WRITE_CMD)) 156 | { 157 | writes_done_this_drain[channel]++; 158 | issue_request_command (wr_ptr); 159 | return; 160 | } 161 | } 162 | 163 | // if no open rows, just issue any other available commands 164 | LL_FOREACH (write_queue_head[channel], wr_ptr) 165 | { 166 | if (wr_ptr->command_issuable) 167 | { 168 | issue_request_command (wr_ptr); 169 | return; 170 | } 171 | } 172 | 173 | // nothing issuable this cycle 174 | return; 175 | } 176 | 177 | // do a read 178 | 179 | // find the thread with the most credits that has an issuable read 180 | int top_credits = -1; 181 | request_t *top_read = NULL; 182 | 183 | LL_FOREACH (read_queue_head[channel], rd_ptr) 184 | { 185 | if (rd_ptr->command_issuable) 186 | { 187 | int current_credits = dbus_credits[channel][rd_ptr->thread_id]; 188 | 189 | // if it's an open row hit, COL_READ_CMD will be the next command 190 | if (rd_ptr->next_command == COL_READ_CMD) 191 | { 192 | // count this thread as having 50% more credits if it's an open-row hit 193 | current_credits += current_credits / 2; 194 | } 195 | 196 | // update the top credits seen so far 197 | if (current_credits > top_credits) 198 | { 199 | top_credits = current_credits; 200 | top_read = rd_ptr; 201 | } 202 | } 203 | } 204 | 205 | // if an issuable read was found, issue it 206 | if (top_read != NULL) 207 | { 208 | // only decrease credits when it actually does a column read (i.e., the read actually happens) 209 | if (top_read->next_command == COL_READ_CMD) 210 | { 211 | // update stats 212 | 
count_col_read[channel][top_read->thread_id]++; 213 | credits_at_read[channel][top_read->thread_id] += 214 | dbus_credits[channel][top_read->thread_id]; 215 | 216 | // update credits 217 | dbus_credits[channel][top_read->thread_id] /= 2; 218 | } 219 | 220 | issue_request_command (top_read); 221 | 222 | } 223 | 224 | return; 225 | } 226 | 227 | void 228 | scheduler_stats () 229 | { 230 | /* Print the average credit balance each thread held when it issued a COL_READ_CMD. */ 231 | 232 | printf ("Average number of credits when performing a COL_READ_CMD\n"); 233 | for (int i = 0; i < MAX_NUM_CHANNELS; i++) 234 | { 235 | if (count_col_read[i][0] == 0) 236 | { 237 | break; 238 | } 239 | 240 | printf ("Channel %d\n", i); 241 | 242 | for (int j = 0; j < MAX_THREADS; j++) 243 | { 244 | if (count_col_read[i][j] == 0) 245 | { 246 | break; 247 | } 248 | 249 | printf ("\tThread %d credits: %f\n", j, 250 | ((float) credits_at_read[i][j]) / 251 | ((float) count_col_read[i][j])); 252 | } 253 | } 254 | } 255 | -------------------------------------------------------------------------------- /src/scheduler-fair.h: -------------------------------------------------------------------------------- 1 | #ifndef __SCHEDULER_H__ 2 | #define __SCHEDULER_H__ 3 | 4 | void init_scheduler_vars(); //called from main 5 | void scheduler_stats(); //called from main 6 | void schedule(int); // scheduler function called every cycle 7 | 8 | #endif //__SCHEDULER_H__ 9 | 10 | -------------------------------------------------------------------------------- /src/scheduler-fcfs.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | 7 | extern long long int CYCLE_VAL; 8 | 9 | void 10 | init_scheduler_vars () 11 | { 12 | // initialize all scheduler variables here 13 | 14 | return; 15 | } 16 | 17 | // write queue high water mark; begin draining writes if write queue exceeds this value 18 | #define HI_WM 40 19 | 20 | // end write queue
drain once write queue has this many writes in it 21 | #define LO_WM 20 22 | 23 | // 1 means we are in write-drain mode for that channel 24 | int drain_writes[MAX_NUM_CHANNELS]; 25 | 26 | /* Each cycle it is possible to issue a valid command from the read or write queues 27 | OR 28 | a valid precharge command to any bank (issue_precharge_command()) 29 | OR 30 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 31 | OR 32 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 33 | OR 34 | a refresh command (issue_refresh_command()) 35 | OR 36 | a power_up command (issue_powerup_command()) 37 | OR 38 | an activate to a specific row (issue_activate_command()). 39 | 40 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 41 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 42 | 43 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 44 | */ 45 | 46 | void 47 | schedule (int channel) 48 | { 49 | request_t *rd_ptr = NULL; 50 | request_t *wr_ptr = NULL; 51 | 52 | 53 | // if in write drain mode, keep draining writes until the 54 | // write queue occupancy drops to LO_WM 55 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 56 | { 57 | drain_writes[channel] = 1; // Keep draining. 58 | } 59 | else 60 | { 61 | drain_writes[channel] = 0; // No need to drain. 
62 | } 63 | 64 | // initiate write drain if either the write queue occupancy 65 | // has reached the HI_WM , OR, if there are no pending read 66 | // requests 67 | if (write_queue_length[channel] > HI_WM) 68 | { 69 | drain_writes[channel] = 1; 70 | } 71 | else 72 | { 73 | if (!read_queue_length[channel]) 74 | drain_writes[channel] = 1; 75 | } 76 | 77 | 78 | // If in write drain mode, look through all the write queue 79 | // elements (already arranged in the order of arrival), and 80 | // issue the command for the first request that is ready 81 | if (drain_writes[channel]) 82 | { 83 | 84 | LL_FOREACH (write_queue_head[channel], wr_ptr) 85 | { 86 | if (wr_ptr->command_issuable) 87 | { 88 | issue_request_command (wr_ptr); 89 | break; 90 | } 91 | } 92 | return; 93 | } 94 | 95 | // Draining Reads 96 | // look through the queue and find the first request whose 97 | // command can be issued in this cycle and issue it 98 | // Simple FCFS 99 | if (!drain_writes[channel]) 100 | { 101 | LL_FOREACH (read_queue_head[channel], rd_ptr) 102 | { 103 | if (rd_ptr->command_issuable) 104 | { 105 | issue_request_command (rd_ptr); 106 | break; 107 | } 108 | } 109 | return; 110 | } 111 | } 112 | 113 | void 114 | scheduler_stats () 115 | { 116 | /* Nothing to print for now. 
*/ 117 | } 118 | -------------------------------------------------------------------------------- /src/scheduler-fcfs.h: -------------------------------------------------------------------------------- 1 | #ifndef __SCHEDULER_H__ 2 | #define __SCHEDULER_H__ 3 | 4 | void init_scheduler_vars(); //called from main 5 | void scheduler_stats(); //called from main 6 | void schedule(int); // scheduler function called every cycle 7 | 8 | #endif //__SCHEDULER_H__ 9 | 10 | -------------------------------------------------------------------------------- /src/scheduler-frfcfs.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | #include "params.h" 7 | 8 | extern long long int CYCLE_VAL; 9 | 10 | long int count_col_hits[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 11 | 12 | void 13 | init_scheduler_vars () 14 | { 15 | // initialize all scheduler variables here 16 | 17 | /* 18 | char *cap_env = getenv("CAPN"); 19 | 20 | if (cap_env) 21 | if(sscanf(cap_env, "%d", &CAPN) == EOF) 22 | fprintf(stderr, "CAPN env variable setting failed"); 23 | */ 24 | 25 | for (int i = 0; i < MAX_NUM_CHANNELS; i++) 26 | { 27 | for (int j = 0; j < MAX_NUM_RANKS; j++) 28 | { 29 | for (int k = 0; k < MAX_NUM_BANKS; k++) 30 | { 31 | count_col_hits[i][j][k] = 0; 32 | } 33 | } 34 | } 35 | 36 | return; 37 | } 38 | 39 | // write queue high water mark; begin draining writes if write queue exceeds this value 40 | #define HI_WM 40 41 | 42 | // end write queue drain once write queue has this many writes in it 43 | #define LO_WM 20 44 | 45 | // 1 means we are in write-drain mode for that channel 46 | int drain_writes[MAX_NUM_CHANNELS]; 47 | 48 | /* Each cycle it is possible to issue a valid command from the read or write queues 49 | OR 50 | a valid precharge command to any bank (issue_precharge_command()) 51 | OR 52 | a valid precharge_all bank command to a rank
(issue_all_bank_precharge_command()) 53 | OR 54 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 55 | OR 56 | a refresh command (issue_refresh_command()) 57 | OR 58 | a power_up command (issue_powerup_command()) 59 | OR 60 | an activate to a specific row (issue_activate_command()). 61 | 62 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 63 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 64 | 65 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 66 | */ 67 | 68 | void 69 | schedule (int channel) 70 | { 71 | request_t *rd_ptr = NULL; 72 | request_t *wr_ptr = NULL; 73 | 74 | 75 | // if in write drain mode, keep draining writes until the 76 | // write queue occupancy drops to LO_WM 77 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 78 | { 79 | drain_writes[channel] = 1; // Keep draining. 80 | } 81 | else 82 | { 83 | drain_writes[channel] = 0; // No need to drain. 
84 | } 85 | 86 | // initiate write drain if either the write queue occupancy 87 | // has reached the HI_WM , OR, if there are no pending read 88 | // requests 89 | if (write_queue_length[channel] > HI_WM) 90 | { 91 | drain_writes[channel] = 1; 92 | } 93 | else 94 | { 95 | if (!read_queue_length[channel]) 96 | drain_writes[channel] = 1; 97 | } 98 | 99 | 100 | // If in write drain mode, look through all the write queue 101 | // elements (already arranged in the order of arrival), and 102 | // issue the command for the first request that is ready 103 | if (drain_writes[channel]) 104 | { 105 | // prioritize open row hits 106 | LL_FOREACH (write_queue_head[channel], wr_ptr) 107 | { 108 | // if COL_WRITE_CMD is the next command, then that means the appropriate row must already be open 109 | if (wr_ptr->command_issuable 110 | && (wr_ptr->next_command == COL_WRITE_CMD)) 111 | { 112 | count_col_hits[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank]++; 113 | issue_request_command (wr_ptr); 114 | 115 | // issue auto-precharge if possible 116 | if (count_col_hits[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] >= CAPN && 117 | is_autoprecharge_allowed(channel, wr_ptr->dram_addr.rank, wr_ptr->dram_addr.bank)) 118 | if (issue_autoprecharge(channel, wr_ptr->dram_addr.rank, wr_ptr->dram_addr.bank)) 119 | count_col_hits[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 0; 120 | 121 | return; 122 | } 123 | } 124 | 125 | LL_FOREACH (write_queue_head[channel], wr_ptr) 126 | { 127 | // if no open rows, just issue any other available commands 128 | if (wr_ptr->command_issuable) 129 | { 130 | issue_request_command (wr_ptr); 131 | count_col_hits[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 0; 132 | break; 133 | } 134 | } 135 | } 136 | 137 | // Draining Reads 138 | // look through the queue and find the first request whose 139 | // command can be issued in this cycle and issue it 140 | // Simple FCFS 141 | if (!drain_writes[channel]) 142 | { 143 | 
LL_FOREACH (read_queue_head[channel], rd_ptr) 144 | { 145 | // if COL_READ_CMD is the next command, then that means the appropriate row must already be open 146 | if (rd_ptr->command_issuable 147 | && (rd_ptr->next_command == COL_READ_CMD)) 148 | { 149 | count_col_hits[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]++; 150 | issue_request_command (rd_ptr); 151 | // issue auto-precharge if possible 152 | if (count_col_hits[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] >= CAPN && 153 | is_autoprecharge_allowed(channel, rd_ptr->dram_addr.rank, rd_ptr->dram_addr.bank)) 154 | if (issue_autoprecharge(channel, rd_ptr->dram_addr.rank, rd_ptr->dram_addr.bank)) 155 | count_col_hits[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 0; 156 | 157 | return; 158 | } 159 | } 160 | 161 | LL_FOREACH (read_queue_head[channel], rd_ptr) 162 | { 163 | // no hits, so just issue other available commands 164 | if (rd_ptr->command_issuable) 165 | { 166 | issue_request_command (rd_ptr); 167 | count_col_hits[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 0; 168 | break; 169 | } 170 | } 171 | } 172 | 173 | // if no commands have been issued, check and issue precharge commands 174 | if (!command_issued_current_cycle[channel]) 175 | { 176 | for (int i = 0; i < NUM_RANKS; i++) 177 | { 178 | for (int j = 0; j < NUM_BANKS; j++) 179 | { /* For all banks on the channel.. */ 180 | if (count_col_hits[channel][i][j] >= CAPN) 181 | { /* See if this bank is a candidate. */ 182 | if (is_precharge_allowed (channel, i, j)) 183 | { /* See if precharge is doable. */ 184 | if (issue_precharge_command (channel, i, j)) 185 | { 186 | count_col_hits[channel][i][j] = 0; 187 | } 188 | } 189 | } 190 | } 191 | } 192 | } 193 | } 194 | 195 | void 196 | scheduler_stats () 197 | { 198 | /* Nothing to print for now.
*/ 199 | } 200 | -------------------------------------------------------------------------------- /src/scheduler-orcw.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | #include "params.h" 7 | 8 | /* A basic FCFS policy augmented with a not-so-clever close-page policy. 9 | If the memory controller is unable to issue a command this cycle, find 10 | a bank that recently serviced a column-wr and close it (precharge it). */ 11 | 12 | 13 | extern long long int CYCLE_VAL; 14 | 15 | /* A data structure to see if a bank is a candidate for precharge. */ 16 | int recent_colacc[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 17 | 18 | /* Keeping track of how many preemptive precharges are performed. */ 19 | long long int num_aggr_precharge = 0; 20 | 21 | void 22 | init_scheduler_vars () 23 | { 24 | // initialize all scheduler variables here 25 | int i, j, k; 26 | for (i = 0; i < MAX_NUM_CHANNELS; i++) 27 | { 28 | for (j = 0; j < MAX_NUM_RANKS; j++) 29 | { 30 | for (k = 0; k < MAX_NUM_BANKS; k++) 31 | { 32 | recent_colacc[i][j][k] = 0; 33 | } 34 | } 35 | } 36 | 37 | return; 38 | } 39 | 40 | // write queue high water mark; begin draining writes if write queue exceeds this value 41 | #define HI_WM 40 42 | 43 | // end write queue drain once write queue has this many writes in it 44 | #define LO_WM 20 45 | 46 | // 1 means we are in write-drain mode for that channel 47 | int drain_writes[MAX_NUM_CHANNELS]; 48 | 49 | /* Each cycle it is possible to issue a valid command from the read or write queues 50 | OR 51 | a valid precharge command to any bank (issue_precharge_command()) 52 | OR 53 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 54 | OR 55 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 56 | OR 57 | a refresh command (issue_refresh_command()) 58 | OR 59 | a
power_up command (issue_powerup_command()) 60 | OR 61 | an activate to a specific row (issue_activate_command()). 62 | 63 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 64 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 65 | 66 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 67 | */ 68 | 69 | 70 | void 71 | schedule (int channel) 72 | { 73 | request_t *rd_ptr = NULL; 74 | request_t *wr_ptr = NULL; 75 | int i, j; 76 | 77 | 78 | // if in write drain mode, keep draining writes until the 79 | // write queue occupancy drops to LO_WM 80 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 81 | { 82 | drain_writes[channel] = 1; // Keep draining. 83 | } 84 | else 85 | { 86 | drain_writes[channel] = 0; // No need to drain. 
87 | } 88 | 89 | // initiate write drain if either the write queue occupancy 90 | // has reached the HI_WM , OR, if there are no pending read 91 | // requests 92 | if (write_queue_length[channel] > HI_WM) 93 | { 94 | drain_writes[channel] = 1; 95 | } 96 | else 97 | { 98 | if (!read_queue_length[channel]) 99 | drain_writes[channel] = 1; 100 | } 101 | 102 | 103 | // If in write drain mode, look through all the write queue 104 | // elements (already arranged in the order of arrival), and 105 | // issue the command for the first request that is ready 106 | if (drain_writes[channel]) 107 | { 108 | LL_FOREACH (write_queue_head[channel], wr_ptr) 109 | { 110 | if (wr_ptr->command_issuable) 111 | { 112 | /* Before issuing the command, see if this bank is now a candidate for closure (if it just did a column-rd/wr). 113 | If the bank just did an activate or precharge, it is not a candidate for closure. */ 114 | if (wr_ptr->next_command == COL_WRITE_CMD) 115 | { 116 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr-> 117 | dram_addr. 118 | bank] = 1; 119 | } 120 | else if (wr_ptr->next_command == ACT_CMD) 121 | { 122 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr-> 123 | dram_addr. 124 | bank] = 0; 125 | } 126 | else if (wr_ptr->next_command == PRE_CMD) 127 | { 128 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr-> 129 | dram_addr. 130 | bank] = 0; 131 | } 132 | issue_request_command (wr_ptr); 133 | break; 134 | } 135 | } 136 | } 137 | 138 | // Draining Reads 139 | // look through the queue and find the first request whose 140 | // command can be issued in this cycle and issue it 141 | // Simple FCFS 142 | if (!drain_writes[channel]) 143 | { 144 | LL_FOREACH (read_queue_head[channel], rd_ptr) 145 | { 146 | if (rd_ptr->command_issuable) 147 | { 148 | issue_request_command (rd_ptr); 149 | break; 150 | } 151 | } 152 | } 153 | 154 | /* If a command hasn't yet been issued to this channel in this cycle, issue a precharge. 
*/ 155 | if (!command_issued_current_cycle[channel]) 156 | { 157 | for (i = 0; i < NUM_RANKS; i++) 158 | { 159 | for (j = 0; j < NUM_BANKS; j++) 160 | { /* For all banks on the channel.. */ 161 | if (recent_colacc[channel][i][j]) 162 | { /* See if this bank is a candidate. */ 163 | if (is_precharge_allowed (channel, i, j)) 164 | { /* See if precharge is doable. */ 165 | if (issue_precharge_command (channel, i, j)) 166 | { 167 | num_aggr_precharge++; 168 | recent_colacc[channel][i][j] = 0; 169 | } 170 | } 171 | } 172 | } 173 | } 174 | } 175 | 176 | 177 | } 178 | 179 | void 180 | scheduler_stats () 181 | { 182 | /* Report how many preemptive precharges were issued. */ 183 | printf ("Number of aggressive precharges: %lld\n", num_aggr_precharge); 184 | } 185 | -------------------------------------------------------------------------------- /src/scheduler-perf.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | #include "params.h" 7 | 8 | extern long long int CYCLE_VAL; 9 | #define MAX_THREADS 64 10 | 11 | double threshold_open; 12 | long long accesses[MAX_NUM_CHANNELS][MAX_THREADS]; 13 | long int hits[MAX_NUM_CHANNELS][MAX_THREADS]; 14 | 15 | /* Keeping track of how many preemptive precharges are performed.
*/ 16 | long long int num_aggr_precharge = 0; 17 | 18 | void 19 | init_scheduler_vars () 20 | { 21 | threshold_open = (double) T_RP / (T_RP + T_RCD); // cast first so integer timing parameters do not truncate the ratio to 0 22 | // initialize all scheduler variables here 23 | 24 | for (int i = 0; i < MAX_NUM_CHANNELS; i++) 25 | { 26 | for (int j = 0; j < MAX_THREADS; j++) 27 | { 28 | hits[i][j] = 0; 29 | accesses[i][j] = 0; 30 | } 31 | } 32 | 33 | return; 34 | } 35 | 36 | // write queue high water mark; begin draining writes if write queue exceeds this value 37 | #define HI_WM 40 38 | 39 | // end write queue drain once write queue has this many writes in it 40 | #define LO_WM 20 41 | 42 | // 1 means we are in write-drain mode for that channel 43 | int drain_writes[MAX_NUM_CHANNELS]; 44 | 45 | /* Each cycle it is possible to issue a valid command from the read or write queues 46 | OR 47 | a valid precharge command to any bank (issue_precharge_command()) 48 | OR 49 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 50 | OR 51 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 52 | OR 53 | a refresh command (issue_refresh_command()) 54 | OR 55 | a power_up command (issue_powerup_command()) 56 | OR 57 | an activate to a specific row (issue_activate_command()). 58 | 59 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 60 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 61 | 62 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed.
63 | */ 64 | 65 | void 66 | schedule (int channel) 67 | { 68 | request_t *rd_ptr = NULL; 69 | request_t *wr_ptr = NULL; 70 | 71 | 72 | // if in write drain mode, keep draining writes until the 73 | // write queue occupancy drops to LO_WM 74 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 75 | { 76 | drain_writes[channel] = 1; // Keep draining. 77 | } 78 | else 79 | { 80 | drain_writes[channel] = 0; // No need to drain. 81 | } 82 | 83 | // initiate write drain if either the write queue occupancy 84 | // has reached the HI_WM , OR, if there are no pending read 85 | // requests 86 | if (write_queue_length[channel] > HI_WM) 87 | { 88 | drain_writes[channel] = 1; 89 | } 90 | else 91 | { 92 | if (!read_queue_length[channel]) 93 | drain_writes[channel] = 1; 94 | } 95 | 96 | 97 | // If in write drain mode, look through all the write queue 98 | // elements (already arranged in the order of arrival), and 99 | // issue the command for the first request that is ready 100 | if (drain_writes[channel]) 101 | { 102 | // prioritize open row hits 103 | LL_FOREACH (write_queue_head[channel], wr_ptr) 104 | { 105 | // if COL_WRITE_CMD is the next command, then that means the appropriate row must already be open 106 | if (wr_ptr->command_issuable 107 | && (wr_ptr->next_command == COL_WRITE_CMD)) 108 | { 109 | hits[channel][wr_ptr->thread_id]++; 110 | accesses[channel][wr_ptr->thread_id]++; 111 | issue_request_command (wr_ptr); 112 | 113 | // issue auto-precharge if possible (hit rate computed in floating point; accesses is nonzero here) 114 | if (((double) hits[channel][wr_ptr->thread_id] / accesses[channel][wr_ptr->thread_id]) >= threshold_open && 115 | is_autoprecharge_allowed(channel, wr_ptr->dram_addr.rank, wr_ptr->dram_addr.bank)) 116 | issue_autoprecharge(channel, wr_ptr->dram_addr.rank, wr_ptr->dram_addr.bank); 117 | 118 | return; 119 | } 120 | } 121 | 122 | LL_FOREACH (write_queue_head[channel], wr_ptr) 123 | { 124 | // if no open rows, just issue any other available commands 125 | if (wr_ptr->command_issuable) 126 | { 127 | 
issue_request_command (wr_ptr); 128 | accesses[channel][wr_ptr->thread_id]++; 129 | break; 130 | } 131 | } 132 | } 133 | 134 | // Draining Reads 135 | // look through the queue and find the first request whose 136 | // command can be issued in this cycle and issue it 137 | // Simple FCFS 138 | if (!drain_writes[channel]) 139 | { 140 | LL_FOREACH (read_queue_head[channel], rd_ptr) 141 | { 142 | // if COL_READ_CMD is the next command, then that means the appropriate row must already be open 143 | if (rd_ptr->command_issuable 144 | && (rd_ptr->next_command == COL_READ_CMD)) 145 | { 146 | hits[channel][rd_ptr->thread_id]++; 147 | accesses[channel][rd_ptr->thread_id]++; 148 | issue_request_command (rd_ptr); 149 | // issue auto-precharge if possible (hit rate computed in floating point; accesses is nonzero here) 150 | if (((double) hits[channel][rd_ptr->thread_id] / accesses[channel][rd_ptr->thread_id]) >= threshold_open && 151 | is_autoprecharge_allowed(channel, rd_ptr->dram_addr.rank, rd_ptr->dram_addr.bank)) 152 | issue_autoprecharge(channel, rd_ptr->dram_addr.rank, rd_ptr->dram_addr.bank); 153 | 154 | return; 155 | } 156 | } 157 | 158 | LL_FOREACH (read_queue_head[channel], rd_ptr) 159 | { 160 | // no hits, so just issue other available commands 161 | if (rd_ptr->command_issuable) 162 | { 163 | issue_request_command (rd_ptr); 164 | accesses[channel][rd_ptr->thread_id]++; 165 | break; 166 | } 167 | } 168 | } 169 | 170 | // if no commands have been issued, check and issue precharge commands 171 | if (!command_issued_current_cycle[channel]) 172 | { 173 | for (int i = 0; i < NUM_RANKS; i++) 174 | { 175 | for (int j = 0; j < NUM_BANKS; j++) 176 | { /* For all banks on the channel.. */ 177 | if (is_precharge_allowed (channel, i, j)) 178 | { /* See if precharge is doable.
*/ 179 | if (issue_precharge_command (channel, i, j)) 180 | num_aggr_precharge++; 181 | } 182 | } 183 | } 184 | } 185 | } 186 | 187 | void 188 | scheduler_stats () 189 | { 190 | printf ("Number of aggressive precharges: %lld\n", num_aggr_precharge); 191 | } 192 | -------------------------------------------------------------------------------- /src/scheduler-pwrdn.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | #include "params.h" 7 | 8 | 9 | /* A simple FCFS scheduler with an aggressive power-down policy. 10 | In any cycle, if the memory controller is unable to fire a 11 | command to a channel, it instead fires a POWER_DOWN_FAST 12 | command. */ 13 | 14 | 15 | extern long long int CYCLE_VAL; 16 | 17 | /* A variable to keep track of whether I've already fired 18 | a power down command to a rank. */ 19 | long long int pwrdn[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 20 | 21 | /* A stat to keep track of how long a rank stays in power-down mode. 22 | This matches up quite closely with a rank's time spent in ACT_PDN. */ 23 | long long int timedn[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 24 | 25 | // keep track of idle cycles 26 | long long int timeidle[MAX_NUM_CHANNELS][MAX_NUM_RANKS]; 27 | 28 | void 29 | init_scheduler_vars () 30 | { 31 | int i, j; 32 | // initialize all scheduler variables here 33 | for (i = 0; i < MAX_NUM_CHANNELS; i++) 34 | { 35 | for (j = 0; j < MAX_NUM_RANKS; j++) 36 | { 37 | /* Initializing pwrdn and timedn arrays.
*/ 38 | pwrdn[i][j] = 0; 39 | timedn[i][j] = 0; 40 | timeidle[i][j] = 0; 41 | } 42 | } 43 | 44 | return; 45 | } 46 | 47 | // write queue high water mark; begin draining writes if write queue exceeds this value 48 | #define HI_WM 40 49 | 50 | // end write queue drain once write queue has this many writes in it 51 | #define LO_WM 20 52 | 53 | // 1 means we are in write-drain mode for that channel 54 | int drain_writes[MAX_NUM_CHANNELS]; 55 | 56 | /* Each cycle it is possible to issue a valid command from the read or write queues 57 | OR 58 | a valid precharge command to any bank (issue_precharge_command()) 59 | OR 60 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 61 | OR 62 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 63 | OR 64 | a refresh command (issue_refresh_command()) 65 | OR 66 | a power_up command (issue_powerup_command()) 67 | OR 68 | an activate to a specific row (issue_activate_command()). 69 | 70 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 71 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 72 | 73 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 74 | */ 75 | 76 | 77 | void 78 | schedule (int channel) 79 | { 80 | request_t *rd_ptr = NULL; 81 | request_t *wr_ptr = NULL; 82 | int i; 83 | 84 | 85 | /* Since a refresh operation wakes a rank, mark the ranks as not being in power-dn mode. 
*/ 86 | if ((CYCLE_VAL % (8 * T_REFI)) == 0) 87 | { 88 | // printf("All banks powered up after refresh %lld.\n",CYCLE_VAL); 89 | for (i = 0; i < NUM_RANKS; i++) 90 | { 91 | if (pwrdn[channel][i]) 92 | { 93 | timedn[channel][i] = 94 | timedn[channel][i] + CYCLE_VAL - pwrdn[channel][i]; 95 | pwrdn[channel][i] = 0; 96 | timeidle[channel][i] = 0; 97 | } 98 | } 99 | } 100 | 101 | 102 | // if in write drain mode, keep draining writes until the 103 | // write queue occupancy drops to LO_WM 104 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 105 | { 106 | drain_writes[channel] = 1; // Keep draining. 107 | } 108 | else 109 | { 110 | drain_writes[channel] = 0; // No need to drain. 111 | } 112 | 113 | // initiate write drain if either the write queue occupancy 114 | // has reached the HI_WM , OR, if there are no pending read 115 | // requests 116 | if (write_queue_length[channel] > HI_WM) 117 | { 118 | drain_writes[channel] = 1; 119 | } 120 | else 121 | { 122 | if (!read_queue_length[channel]) 123 | drain_writes[channel] = 1; 124 | } 125 | 126 | 127 | // If in write drain mode, look through all the write queue 128 | // elements (already arranged in the order of arrival), and 129 | // issue the command for the first request that is ready 130 | if (drain_writes[channel]) 131 | { 132 | 133 | LL_FOREACH (write_queue_head[channel], wr_ptr) 134 | { 135 | if (wr_ptr->command_issuable) 136 | { 137 | if (issue_request_command(wr_ptr)) 138 | { 139 | /* If the command was successful, mark that the rank has now been woken up. Just book-keeping being done. 
*/ 140 | if (pwrdn[wr_ptr->dram_addr.channel][wr_ptr->dram_addr.rank]) 141 | { 142 | timedn[wr_ptr->dram_addr.channel][wr_ptr->dram_addr.rank] = 143 | timedn[wr_ptr->dram_addr.channel][wr_ptr->dram_addr.rank] + 144 | CYCLE_VAL - pwrdn[wr_ptr->dram_addr.channel][wr_ptr->dram_addr.rank]; 145 | 146 | pwrdn[wr_ptr->dram_addr.channel][wr_ptr->dram_addr.rank] = 0; 147 | // printf("Powering up c%d r%d in cycle %lld\n", wr_ptr->dram_addr.channel, wr_ptr->dram_addr.rank, CYCLE_VAL); 148 | } 149 | timeidle[wr_ptr->dram_addr.channel][wr_ptr->dram_addr.rank] = 0; 150 | } 151 | 152 | break; 153 | } 154 | } 155 | /* If you were unable to drain any writes this cycle, go ahead and try to power down. */ 156 | for (i = 0; i < NUM_RANKS; i++) 157 | { 158 | if (!pwrdn[channel][i]) 159 | { 160 | if (timeidle[channel][i] >= PWRN) 161 | { 162 | if (is_powerdown_fast_allowed (channel, i)) 163 | { 164 | if (issue_powerdown_command (channel, i, PWR_DN_FAST_CMD)) 165 | { 166 | pwrdn[channel][i] = CYCLE_VAL; 167 | timeidle[channel][i] = 0; 168 | // printf("Powered down c%d r%d in cycle %lld\n", channel, i, CYCLE_VAL); 169 | return; 170 | } 171 | } 172 | } 173 | timeidle[channel][i]++; 174 | } 175 | } 176 | return; 177 | } 178 | 179 | // Draining Reads 180 | // look through the queue and find the first request whose 181 | // command can be issued in this cycle and issue it 182 | // Simple FCFS 183 | if (!drain_writes[channel]) 184 | { 185 | LL_FOREACH (read_queue_head[channel], rd_ptr) 186 | { 187 | if (rd_ptr->command_issuable) 188 | { 189 | if (issue_request_command (rd_ptr)) 190 | { 191 | /* If the command was successful, mark that the rank has now been woken up. Just book-keeping being done. 
*/ 192 | if (pwrdn[rd_ptr->dram_addr.channel][rd_ptr->dram_addr.rank]) 193 | { 194 | timedn[rd_ptr->dram_addr.channel][rd_ptr->dram_addr.rank] = 195 | timedn[rd_ptr->dram_addr.channel][rd_ptr->dram_addr.rank] + 196 | CYCLE_VAL - pwrdn[rd_ptr->dram_addr.channel][rd_ptr->dram_addr.rank]; 197 | pwrdn[rd_ptr->dram_addr.channel][rd_ptr->dram_addr.rank] = 0; 198 | // printf("Powering up c%d r%d in cycle %lld\n", rd_ptr->dram_addr.channel, rd_ptr->dram_addr.rank, CYCLE_VAL); 199 | } 200 | timeidle[rd_ptr->dram_addr.channel][rd_ptr->dram_addr.rank] = 0; 201 | } 202 | break; 203 | } 204 | } 205 | /* If you were unable to issue any reads this cycle, go ahead and try to power down. */ 206 | for (i = 0; i < NUM_RANKS; i++) 207 | { 208 | if (!pwrdn[channel][i]) 209 | { 210 | if (timeidle[channel][i] >= PWRN) 211 | { 212 | if (is_powerdown_fast_allowed (channel, i)) 213 | { 214 | if (issue_powerdown_command (channel, i, PWR_DN_FAST_CMD)) 215 | { 216 | pwrdn[channel][i] = CYCLE_VAL; 217 | timeidle[channel][i] = 0; 218 | // printf("Powered down c%d r%d in cycle %lld\n", channel, i, CYCLE_VAL); 219 | return; 220 | } 221 | } 222 | } 223 | timeidle[channel][i]++; 224 | } 225 | } 226 | return; 227 | } 228 | } 229 | 230 | void 231 | scheduler_stats () 232 | { 233 | int i, j; 234 | printf ("Printing scheduler stats\n"); 235 | for (i = 0; i < NUM_CHANNELS; i++) 236 | { 237 | for (j = 0; j < NUM_RANKS; j++) 238 | { 239 | printf ("Power down time c%d r%d %lld\n", i, j, timedn[i][j]); 240 | } 241 | } 242 | } 243 | -------------------------------------------------------------------------------- /src/scheduler-pwrdn.h: -------------------------------------------------------------------------------- 1 | #ifndef __SCHEDULER_H__ 2 | #define __SCHEDULER_H__ 3 | 4 | void init_scheduler_vars(); //called from main 5 | void scheduler_stats(); //called from main 6 | void schedule(int); // scheduler function called every cycle 7 | 8 | #endif //__SCHEDULER_H__ 9 | 10 | 
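The power-down bookkeeping in scheduler-pwrdn.c can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not part of the simulator: the type rank_pd_t and the functions rank_wake() and rank_idle_tick() are hypothetical names, `threshold` stands in for PWRN, and `allowed` stands in for the result of is_powerdown_fast_allowed().

```c
#include <assert.h>

/* Hypothetical per-rank record; the scheduler keeps the same three
   values in the separate arrays pwrdn, timedn and timeidle. */
typedef struct
{
  long long down_since;   /* cycle of the power-down command, 0 if awake */
  long long total_down;   /* accumulated cycles spent powered down */
  long long idle;         /* idle cycles counted since the last reset */
} rank_pd_t;

/* Called when a command is issued to the rank: close any open
   power-down interval and restart the idle counter. */
static void
rank_wake (rank_pd_t * r, long long now)
{
  if (r->down_since)
    {
      r->total_down += now - r->down_since;
      r->down_since = 0;
    }
  r->idle = 0;
}

/* Called once per cycle for a rank.  Returns 1 if the rank is powered
   down in this cycle, 0 otherwise.  `threshold` and `allowed` are
   assumptions standing in for PWRN and is_powerdown_fast_allowed(). */
static int
rank_idle_tick (rank_pd_t * r, long long now, long long threshold,
                int allowed)
{
  if (r->down_since)
    return 0;                   /* already powered down */
  if (r->idle >= threshold && allowed)
    {
      r->down_since = now;      /* remember when the interval began */
      r->idle = 0;
      return 1;
    }
  r->idle++;
  return 0;
}
```

As in the scheduler itself, a stored cycle of 0 doubles as "not powered down", so a power-down issued in cycle 0 would not be accounted for; a separate flag would avoid that ambiguity.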
-------------------------------------------------------------------------------- /src/scheduler-stride.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include "utlist.h" 3 | #include "utils.h" 4 | #include "params.h" 5 | #include 6 | 7 | #include "memory_controller.h" 8 | 9 | #define MAXGHBSIZE 512 10 | #define MAXINDEXTABLE 1024 11 | 12 | 13 | //function to convert physical to DRAM address 14 | dram_address_t calc_dram_addr_copy(long long int physical_address) 15 | { 16 | 17 | 18 | long long int input_a, temp_b, temp_a; 19 | 20 | int channelBitWidth = log_base2(NUM_CHANNELS); 21 | int rankBitWidth = log_base2(NUM_RANKS); 22 | int bankBitWidth = log_base2(NUM_BANKS); 23 | int rowBitWidth = log_base2(NUM_ROWS); 24 | int colBitWidth = log_base2(NUM_COLUMNS); 25 | int byteOffsetWidth = log_base2(CACHE_LINE_SIZE); 26 | 27 | 28 | 29 | dram_address_t this_a ; 30 | 31 | this_a.actual_address = physical_address; 32 | 33 | input_a = physical_address; 34 | 35 | input_a = input_a >> byteOffsetWidth; // strip out the cache_offset 36 | 37 | 38 | if(ADDRESS_MAPPING == 1) 39 | { 40 | temp_b = input_a; 41 | input_a = input_a >> colBitWidth; 42 | temp_a = input_a << colBitWidth; 43 | this_a.column = temp_a ^ temp_b; //strip out the column address 44 | 45 | 46 | temp_b = input_a; 47 | input_a = input_a >> channelBitWidth; 48 | temp_a = input_a << channelBitWidth; 49 | this_a.channel = temp_a ^ temp_b; // strip out the channel address 50 | 51 | 52 | temp_b = input_a; 53 | input_a = input_a >> bankBitWidth; 54 | temp_a = input_a << bankBitWidth; 55 | this_a.bank = temp_a ^ temp_b; // strip out the bank address 56 | 57 | 58 | temp_b = input_a; 59 | input_a = input_a >> rankBitWidth; 60 | temp_a = input_a << rankBitWidth; 61 | this_a.rank = temp_a ^ temp_b; // strip out the rank address 62 | 63 | 64 | temp_b = input_a; 65 | input_a = input_a >> rowBitWidth; 66 | temp_a = input_a << rowBitWidth; 67 | this_a.row = temp_a ^ temp_b; // 
strip out the row number 68 | } 69 | else 70 | { 71 | temp_b = input_a; 72 | input_a = input_a >> channelBitWidth; 73 | temp_a = input_a << channelBitWidth; 74 | this_a.channel = temp_a ^ temp_b; // strip out the channel address 75 | 76 | 77 | temp_b = input_a; 78 | input_a = input_a >> bankBitWidth; 79 | temp_a = input_a << bankBitWidth; 80 | this_a.bank = temp_a ^ temp_b; // strip out the bank address 81 | 82 | 83 | temp_b = input_a; 84 | input_a = input_a >> rankBitWidth; 85 | temp_a = input_a << rankBitWidth; 86 | this_a.rank = temp_a ^ temp_b; // strip out the rank address 87 | 88 | 89 | temp_b = input_a; 90 | input_a = input_a >> colBitWidth; 91 | temp_a = input_a << colBitWidth; 92 | this_a.column = temp_a ^ temp_b; //strip out the column address 93 | 94 | 95 | temp_b = input_a; 96 | input_a = input_a >> rowBitWidth; 97 | temp_a = input_a << rowBitWidth; 98 | this_a.row = temp_a ^ temp_b; // strip out the row number 99 | } 100 | return(this_a); 101 | } 102 | 103 | 104 | //GHB variables (global) 105 | int GHBhead; 106 | int GHBmaxed; 107 | 108 | 109 | struct GHBentry 110 | { 111 | int number; 112 | int thread_id; 113 | int channel; 114 | int rank; 115 | int bank; 116 | long long int row; 117 | long long int instruction_pc; 118 | long long int physical_address; 119 | struct GHBentry * link; 120 | }; 121 | 122 | 123 | //GHB 124 | struct GHBentry GHB[MAXGHBSIZE]; 125 | 126 | 127 | struct ToBeIssued 128 | { 129 | int issue; 130 | int rank; 131 | int bank; 132 | long long int row; 133 | }; 134 | 135 | struct ToBeIssued tbi[MAX_NUM_CHANNELS]; 136 | 137 | 138 | 139 | struct StrideTableentry 140 | { 141 | int laststride; 142 | long long int prev_address; 143 | int detected; 144 | }; 145 | 146 | 147 | //variables required for stats print 148 | int number_of_spec_activates; 149 | int number_of_hits; 150 | 151 | int activates[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 152 | 153 | 154 | //stride table 155 | struct StrideTableentry ST[1024]; 156 | 157 | 158 | 159 | 
extern long long int CYCLE_VAL; 160 | 161 | //previous read queue sizes 162 | int prev_rqsize[MAX_NUM_CHANNELS]; 163 | 164 | //index table 165 | struct GHBentry * IndexTable[MAXINDEXTABLE]; 166 | 167 | //generate index for the index table: 2 bits of thread id above the low 8 bits of (pc XOR address) 168 | int generateindex ( int thread_id, long long int instruction_pc, long long int physical_address) 169 | { 170 | long long int xorred; 171 | //printf("thread_id is %X, instr_pc is %llX, addr is %llX \n",thread_id, instruction_pc, physical_address); 172 | xorred = instruction_pc ^ physical_address; 173 | xorred = xorred & 0x00000000000000FF; 174 | thread_id = thread_id & 0x00000003; 175 | thread_id = thread_id << 8; 176 | //printf("index is %d %X\n",thread_id + (int)xorred); 177 | return (thread_id + (int)xorred); 178 | } 179 | 180 | //push entry into GHB 181 | void push ( dram_address_t dram_addr,int thread_id, long long int instruction_pc, long long int physical_address ) 182 | { 183 | struct GHBentry * loop = NULL; 184 | int GHBnewhead = (GHBhead+1)%MAXGHBSIZE; //head incremented 185 | 186 | 187 | if(GHBhead == MAXGHBSIZE) 188 | GHBmaxed = 1; 189 | 190 | int index; 191 | 192 | if(GHBmaxed == 1) //if GHB is maxed, then tail entry must be removed 193 | { 194 | index = generateindex(GHB[GHBnewhead].thread_id, GHB[GHBnewhead].instruction_pc, GHB[GHBnewhead].physical_address); 195 | loop = IndexTable[index]; 196 | if(loop == &GHB[GHBnewhead]) 197 | { 198 | IndexTable[index] = NULL; 199 | } 200 | else 201 | { 202 | 203 | if(loop->link == &GHB[GHBnewhead]) 204 | { 205 | loop->link = NULL; 206 | 207 | } 208 | 209 | } 210 | } 211 | 212 | 213 | //insert GHB entry 214 | GHB[GHBnewhead].thread_id = thread_id; 215 | GHB[GHBnewhead].channel = dram_addr.channel; 216 | GHB[GHBnewhead].rank = dram_addr.rank; 217 | GHB[GHBnewhead].bank = dram_addr.bank; 218 | GHB[GHBnewhead].row = dram_addr.row; 219 | GHB[GHBnewhead].physical_address = physical_address; 220 | GHB[GHBnewhead].instruction_pc = instruction_pc; 221 | index = generateindex(thread_id,
instruction_pc, physical_address); 222 | GHB[GHBnewhead].link = IndexTable[index]; 223 | IndexTable[index] = &GHB[GHBnewhead]; 224 | 225 | GHBhead = GHBnewhead; 226 | 227 | 228 | 229 | 230 | 231 | } 232 | 233 | 234 | void init_scheduler_vars() 235 | { 236 | int i; 237 | // initialize all scheduler variables here 238 | for (i=0; inext) 323 | { 324 | 325 | push(updater->dram_addr,updater->thread_id, updater->instruction_pc, updater->physical_address); 326 | 327 | } 328 | prev_rqsize[channel] = read_queue_length[channel]; 329 | 330 | 331 | // if in write drain mode, keep draining writes until the 332 | // write queue occupancy drops to LO_WM 333 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) { 334 | drain_writes[channel] = 1; // Keep draining. 335 | } 336 | else { 337 | drain_writes[channel] = 0; // No need to drain. 338 | } 339 | 340 | // initiate write drain if either the write queue occupancy 341 | // has reached the HI_WM , OR, if there are no pending read 342 | // requests 343 | if(write_queue_length[channel] > HI_WM) 344 | { 345 | drain_writes[channel] = 1; 346 | } 347 | else { 348 | if (!read_queue_length[channel]) 349 | drain_writes[channel] = 1; 350 | } 351 | 352 | int j; 353 | 354 | int o; 355 | int p; 356 | 357 | int Isused[MAX_NUM_RANKS][MAX_NUM_BANKS]; 358 | for (o=0; odram_addr.rank][wr_ptr->dram_addr.bank]==0) 372 | Isused[wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank]=2; 373 | 374 | //initialise stride table 375 | ST[wr_ptr->instruction_pc%1024].laststride = 0; 376 | ST[wr_ptr->instruction_pc%1024].prev_address = 0; 377 | ST[wr_ptr->instruction_pc%1024].detected = 0; 378 | 379 | //first ready served 380 | if(wr_ptr->command_issuable && wr_ptr->next_command == COL_WRITE_CMD) 381 | { 382 | if(activates[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] == 1) 383 | number_of_hits ++; 384 | issue_request_command(wr_ptr); 385 | tbi[channel].issue = 0; 386 | return; 387 | } 388 | 389 | } 390 | 
LL_FOREACH(write_queue_head[channel], wr_ptr) 391 | { 392 | 393 | //detect strides 394 | if(ST[wr_ptr->instruction_pc%1024].laststride == 0 && ST[wr_ptr->instruction_pc%1024].prev_address == 0) 395 | { 396 | ST[wr_ptr->instruction_pc%1024].prev_address = wr_ptr->physical_address; 397 | } 398 | else if ( ST[wr_ptr->instruction_pc%1024].laststride == 0 ) 399 | { 400 | ST[wr_ptr->instruction_pc%1024].laststride = wr_ptr->physical_address - ST[wr_ptr->instruction_pc%1024].prev_address; 401 | ST[wr_ptr->instruction_pc%1024].prev_address = wr_ptr->physical_address; 402 | } 403 | else if (ST[wr_ptr->instruction_pc%1024].laststride == wr_ptr->physical_address - ST[wr_ptr->instruction_pc%1024].prev_address) 404 | { 405 | ST[wr_ptr->instruction_pc%1024].detected = 1; 406 | ST[wr_ptr->instruction_pc%1024].prev_address = wr_ptr->physical_address; 407 | } 408 | 409 | if(wr_ptr->command_issuable) 410 | { 411 | if(wr_ptr->next_command == PRE_CMD) 412 | activates[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 0 ; 413 | issue_request_command(wr_ptr); 414 | tbi[channel].issue = 0; 415 | return; 416 | } 417 | 418 | } 419 | 420 | 421 | 422 | 423 | 424 | 425 | 426 | } 427 | 428 | 429 | 430 | 431 | LL_FOREACH(read_queue_head[channel],rd_ptr) 432 | { 433 | ST[rd_ptr->instruction_pc%1024].laststride = 0; 434 | ST[rd_ptr->instruction_pc%1024].prev_address = 0; 435 | ST[rd_ptr->instruction_pc%1024].detected = 0; 436 | 437 | 438 | 439 | 440 | if(Isused[rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]==0) 441 | Isused[rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]=1; 442 | else if (Isused[rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]==2) 443 | Isused[rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]=3; 444 | 445 | 446 | if(rd_ptr->command_issuable && rd_ptr->next_command == COL_READ_CMD && !drain_writes[channel] && Isused[rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]<2) 447 | { 448 | if(activates[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] == 1) 449 | number_of_hits 
++; 450 | issue_request_command(rd_ptr); 451 | prev_rqsize[channel] = prev_rqsize[channel] - 1; 452 | tbi[channel].issue = 0; 453 | return; 454 | } 455 | } 456 | 457 | 458 | LL_FOREACH(read_queue_head[channel],rd_ptr) 459 | { 460 | if(ST[rd_ptr->instruction_pc%1024].laststride == 0 && ST[rd_ptr->instruction_pc%1024].prev_address == 0) 461 | { 462 | ST[rd_ptr->instruction_pc%1024].prev_address = rd_ptr->physical_address; 463 | } 464 | else if ( ST[rd_ptr->instruction_pc%1024].laststride == 0 ) 465 | { 466 | ST[rd_ptr->instruction_pc%1024].laststride = rd_ptr->physical_address - ST[rd_ptr->instruction_pc%1024].prev_address; 467 | ST[rd_ptr->instruction_pc%1024].prev_address = rd_ptr->physical_address; 468 | } 469 | else if (ST[rd_ptr->instruction_pc%1024].laststride == rd_ptr->physical_address - ST[rd_ptr->instruction_pc%1024].prev_address) 470 | { 471 | ST[rd_ptr->instruction_pc%1024].detected = 1; 472 | ST[rd_ptr->instruction_pc%1024].prev_address = rd_ptr->physical_address; 473 | } 474 | 475 | if(rd_ptr->command_issuable && Isused[rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank]<2) 476 | { 477 | if(rd_ptr->next_command == PRE_CMD) 478 | activates[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 0 ; 479 | issue_request_command(rd_ptr); 480 | tbi[channel].issue = 0; 481 | return; 482 | } 483 | } 484 | //precharge from write queue 485 | LL_FOREACH(write_queue_head[channel], wr_ptr) 486 | { 487 | if(wr_ptr->command_issuable && wr_ptr->next_command == PRE_CMD && Isused[wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank]%2==0) 488 | { 489 | activates[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] =0; 490 | issue_request_command(wr_ptr); 491 | tbi[channel].issue = 0; 492 | return; 493 | } 494 | 495 | } 496 | 497 | //try from stride detector 498 | if(drain_writes[channel]) 499 | { 500 | LL_FOREACH(write_queue_head[channel], wr_ptr) 501 | { 502 | 503 | if(ST[wr_ptr->instruction_pc%1024].detected == 1) 504 | { 505 | for(j=1;j<7;j++) 506 | { 507 | long long 
int next_physical= ST[wr_ptr->instruction_pc%1024].prev_address + j*ST[wr_ptr->instruction_pc%1024].laststride; 508 | dram_address_t next_dram_addr=calc_dram_addr_copy(next_physical); 509 | dram_address_t prev_address = calc_dram_addr_copy(ST[wr_ptr->instruction_pc%1024].prev_address); 510 | if (next_dram_addr.channel==channel) 511 | { 512 | if((prev_address.rank!=next_dram_addr.rank)||(prev_address.bank!=next_dram_addr.bank)||(prev_address.row!=next_dram_addr.row)) 513 | { 514 | if(Isused[next_dram_addr.rank][next_dram_addr.bank]==0 && dram_state[channel][next_dram_addr.rank][next_dram_addr.bank].state == PRECHARGING && is_activate_allowed(channel, next_dram_addr.rank, next_dram_addr.bank)) 515 | { 516 | issue_activate_command(channel, next_dram_addr.rank, next_dram_addr.bank, next_dram_addr.row); 517 | //free(next_dram_addr); 518 | activates[channel][next_dram_addr.rank][next_dram_addr.bank] =1; 519 | number_of_spec_activates = number_of_spec_activates+1; 520 | tbi[channel].issue = 0; 521 | return; 522 | } 523 | if(Isused[next_dram_addr.rank][next_dram_addr.bank]==0 && dram_state[channel][next_dram_addr.rank][next_dram_addr.bank].state == ROW_ACTIVE && is_precharge_allowed(channel, next_dram_addr.rank, next_dram_addr.bank)) 524 | { 525 | issue_precharge_command(channel, next_dram_addr.rank, next_dram_addr.bank); 526 | activates[channel][next_dram_addr.rank][next_dram_addr.bank] =0; 527 | 528 | tbi[channel].issue = 0; 529 | //free(next_dram_addr); 530 | return; 531 | } 532 | } 533 | 534 | } 535 | else 536 | { 537 | tbi[next_dram_addr.channel].issue = 1; 538 | tbi[next_dram_addr.channel].rank = next_dram_addr.rank; 539 | tbi[next_dram_addr.channel].bank = next_dram_addr.bank; 540 | tbi[next_dram_addr.channel].row = next_dram_addr.row; 541 | 542 | } 543 | } 544 | ST[wr_ptr->instruction_pc%1024].detected = 0; 545 | 546 | } 547 | 548 | 549 | 550 | } 551 | } 552 | 553 | LL_FOREACH(read_queue_head[channel], rd_ptr) 554 | { 555 | 556 | 
if(ST[rd_ptr->instruction_pc%1024].detected == 1) 557 | { 558 | for(j=1;j<7;j++) 559 | { 560 | long long int next_physical= ST[rd_ptr->instruction_pc%1024].prev_address + j*ST[rd_ptr->instruction_pc%1024].laststride; 561 | dram_address_t next_dram_addr=calc_dram_addr_copy(next_physical); 562 | dram_address_t prev_address = calc_dram_addr_copy(ST[rd_ptr->instruction_pc%1024].prev_address); 563 | if (next_dram_addr.channel==channel) 564 | { 565 | if((prev_address.rank!=next_dram_addr.rank)||(prev_address.bank!=next_dram_addr.bank)||(prev_address.row!=next_dram_addr.row)) 566 | { 567 | if(Isused[next_dram_addr.rank][next_dram_addr.bank]==0 && dram_state[channel][next_dram_addr.rank][next_dram_addr.bank].state == PRECHARGING && is_activate_allowed(channel, next_dram_addr.rank, next_dram_addr.bank)) 568 | { 569 | issue_activate_command(channel, next_dram_addr.rank, next_dram_addr.bank, next_dram_addr.row); 570 | activates[channel][next_dram_addr.rank][next_dram_addr.bank] =1; 571 | number_of_spec_activates = number_of_spec_activates+1; 572 | tbi[channel].issue = 0; 573 | return; 574 | } 575 | if(Isused[next_dram_addr.rank][next_dram_addr.bank]==0 && dram_state[channel][next_dram_addr.rank][next_dram_addr.bank].state == ROW_ACTIVE && is_precharge_allowed(channel, next_dram_addr.rank, next_dram_addr.bank)) 576 | { 577 | issue_precharge_command(channel, next_dram_addr.rank, next_dram_addr.bank); 578 | tbi[channel].issue = 0; 579 | activates[channel][next_dram_addr.rank][next_dram_addr.bank] =0; 580 | 581 | return; 582 | } 583 | } 584 | 585 | } 586 | } 587 | ST[rd_ptr->instruction_pc%1024].detected = 0; 588 | 589 | } 590 | } 591 | 592 | 593 | //try from GHB 594 | 595 | int index; 596 | struct GHBentry * loop = NULL; 597 | struct GHBentry * head = NULL; 598 | 599 | 600 | head = &GHB[GHBhead]; 601 | 602 | int r; 603 | for(r=0;r<3;r++) 604 | { 605 | 606 | loop = head->link; 607 | head = head->link; 608 | 609 | 610 | if(loop!=NULL) 611 | index = ((loop->number)+1)%MAXGHBSIZE; 
612 | else 613 | return; 614 | 615 | for(j=0;j<20;j++) 616 | { 617 | if(index == GHBhead) 618 | break; 619 | 620 | if(Isused[GHB[index].rank][GHB[index].bank] != 0 || channel != GHB[index].channel) 621 | { 622 | index = (index + 1)%MAXGHBSIZE; 623 | continue; 624 | } 625 | 626 | 627 | if (dram_state[GHB[index].channel][GHB[index].rank][GHB[index].bank].state == ROW_ACTIVE) 628 | { 629 | if(dram_state[GHB[index].channel][GHB[index].rank][GHB[index].bank].active_row != GHB[index].row) 630 | { 631 | if(is_precharge_allowed(GHB[index].channel,GHB[index].rank,GHB[index].bank)) 632 | { 633 | issue_precharge_command(GHB[index].channel,GHB[index].rank,GHB[index].bank); 634 | 635 | activates[GHB[index].channel][GHB[index].rank][GHB[index].bank] = 0 ; 636 | 637 | tbi[channel].issue = 0; 638 | return; 639 | } 640 | } 641 | 642 | } 643 | index = (index + 1)%MAXGHBSIZE; 644 | } 645 | if(tbi[channel].issue == 1) 646 | { 647 | if(!Isused[tbi[channel].rank][tbi[channel].bank] && dram_state[channel][tbi[channel].rank][tbi[channel].bank].state == PRECHARGING && is_activate_allowed(channel, tbi[channel].rank, tbi[channel].bank)) 648 | { 649 | issue_activate_command(channel, tbi[channel].rank, tbi[channel].bank, tbi[channel].row); 650 | //free(next_dram_addr); 651 | activates[channel][tbi[channel].rank][tbi[channel].bank] =1; 652 | number_of_spec_activates = number_of_spec_activates+1; 653 | 654 | tbi[channel].issue = 0; 655 | return; 656 | } 657 | if(!Isused[tbi[channel].rank][tbi[channel].bank] && dram_state[channel][tbi[channel].rank][tbi[channel].bank].state == ROW_ACTIVE && is_precharge_allowed(channel, tbi[channel].rank, tbi[channel].bank)) 658 | { 659 | issue_precharge_command(channel, tbi[channel].rank, tbi[channel].bank); 660 | tbi[channel].issue = 0; 661 | activates[channel][tbi[channel].rank][tbi[channel].bank] =0; 662 | //free(next_dram_addr); 663 | return; 664 | } 665 | 666 | } 667 | 668 | } 669 | } 670 | 671 | void scheduler_stats() 672 | { 673 | printf("\nNumber of 
speculative activates = %d ", number_of_spec_activates); 674 | printf("\nNumber of row hits = %d ", number_of_hits); 675 | 676 | 677 | 678 | } 679 | -------------------------------------------------------------------------------- /src/scheduler-test.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | #include "params.h" 7 | 8 | /* A scheduling algorithm based on a priority-based fair scheduling policy. 9 | * 10 | * A basic FCFS policy augmented with a hit-rate-aware close-page policy. 11 | Instead of closing the page immediately, wait for a few idle cycles and decide 12 | whether to close it based on the hit rate of the thread that last accessed this 13 | row. 14 | 15 | The hit rate is defined as: hit rate = hits_in_row_buffer / accesses 16 | 17 | The break-even hit rate for keeping a page open is T_RP / (T_RP+T_RCD) 18 | 19 | If the memory controller is unable to issue a command this cycle, find 20 | a bank that recently serviced a column-rd/wr and close it (precharge it). */ 21 | 22 | 23 | extern long long int CYCLE_VAL; 24 | #define MAX_THREADS 64 25 | 26 | long CAPN; 27 | 28 | /* A data structure to see if a bank is a candidate for precharge. */ 29 | int recent_colacc[MAX_NUM_CHANNELS][MAX_NUM_RANKS][MAX_NUM_BANKS]; 30 | 31 | /* Keeping track of how many preemptive precharges are performed.
*/ 32 | long long int num_aggr_precharge = 0; 33 | double priority[MAX_NUM_CHANNELS][MAX_THREADS]; 34 | long long accesses[MAX_NUM_CHANNELS][MAX_THREADS]; 35 | long long hits[MAX_NUM_CHANNELS][MAX_THREADS]; 36 | 37 | int get_core_highest_priority(int channel) 38 | { 39 | int max_index = 0; // core with the largest priority seen so far 40 | for (int i = 0; i < MAX_THREADS; i++) 41 | if (priority[channel][max_index] < priority[channel][i]) 42 | max_index = i; 43 | 44 | return max_index; 45 | } 46 | 47 | 48 | void 49 | init_scheduler_vars () 50 | { 51 | CAPN = T_RP / (T_RP+T_RCD); 52 | // initialize all scheduler variables here 53 | int i, j, k; 54 | for (i = 0; i < MAX_NUM_CHANNELS; i++) 55 | { 56 | for (j = 0; j < MAX_NUM_RANKS; j++) 57 | { 58 | for (k = 0; k < MAX_NUM_BANKS; k++) 59 | { 60 | recent_colacc[i][j][k] = 0; 61 | } 62 | } 63 | } 64 | for(int channel = 0; channel < NUM_CHANNELS; channel++) 65 | for(int core =0; core < NUMCORES; core++) { 66 | accesses[channel][core] = 0; 67 | hits[channel][core] = 0; 68 | } 69 | 70 | 71 | 72 | return; 73 | } 74 | 75 | // write queue high water mark; begin draining writes if write queue exceeds this value 76 | #define HI_WM 40 77 | 78 | // end write queue drain once write queue has this many writes in it 79 | #define LO_WM 20 80 | 81 | // 1 means we are in write-drain mode for that channel 82 | int drain_writes[MAX_NUM_CHANNELS]; 83 | 84 | /* Each cycle it is possible to issue a valid command from the read or write queues 85 | OR 86 | a valid precharge command to any bank (issue_precharge_command()) 87 | OR 88 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 89 | OR 90 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 91 | OR 92 | a refresh command (issue_refresh_command()) 93 | OR 94 | a power_up command (issue_powerup_command()) 95 | OR 96 | an activate to a specific row (issue_activate_command()).
97 | 98 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 99 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 100 | 101 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 102 | */ 103 | 104 | 105 | void 106 | schedule (int channel) 107 | { 108 | request_t *rd_ptr = NULL; 109 | request_t *wr_ptr = NULL; 110 | int i, j; 111 | 112 | 113 | // if in write drain mode, keep draining writes until the 114 | // write queue occupancy drops to LO_WM 115 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 116 | { 117 | drain_writes[channel] = 1; // Keep draining. 118 | } 119 | else 120 | { 121 | drain_writes[channel] = 0; // No need to drain. 122 | } 123 | 124 | // initiate write drain if either the write queue occupancy 125 | // has reached the HI_WM , OR, if there are no pending read 126 | // requests 127 | if (write_queue_length[channel] > HI_WM) 128 | { 129 | drain_writes[channel] = 1; 130 | } 131 | else 132 | { 133 | if (!read_queue_length[channel]) 134 | drain_writes[channel] = 1; 135 | } 136 | 137 | 138 | // If in write drain mode, look through all the write queue 139 | // elements (already arranged in the order of arrival), and 140 | // issue the command for the first request that is ready 141 | if (drain_writes[channel]) 142 | { 143 | 144 | LL_FOREACH (write_queue_head[channel], wr_ptr) 145 | { 146 | if (wr_ptr->command_issuable) 147 | { 148 | /* Before issuing the command, see if this bank is now a candidate for closure (if it just did a column-rd/wr). 
149 | If the bank just did an activate or precharge, it is not a candidate for closure. */ 150 | if (wr_ptr->next_command == COL_WRITE_CMD) 151 | { 152 | if (wr_ptr->thread_id != get_core_highest_priority(channel)) 153 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 1; 154 | else 155 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 0; 156 | hits[channel][wr_ptr->thread_id]++; 157 | } 158 | if (wr_ptr->next_command == ACT_CMD) 159 | { 160 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 0; 161 | } 162 | if (wr_ptr->next_command == PRE_CMD) 163 | { 164 | recent_colacc[channel][wr_ptr->dram_addr.rank][wr_ptr->dram_addr.bank] = 0; 165 | } 166 | issue_request_command (wr_ptr); 167 | accesses[channel][wr_ptr->thread_id]++; 168 | break; 169 | } 170 | } 171 | } 172 | 173 | // Draining Reads 174 | // look through the queue and find the first request whose 175 | // command can be issued in this cycle and issue it 176 | // Simple FCFS 177 | if (!drain_writes[channel]) 178 | { 179 | LL_FOREACH (read_queue_head[channel], rd_ptr) 180 | { 181 | if (rd_ptr->command_issuable) 182 | { 183 | /* Before issuing the command, see if this bank is now a candidate for closure (if it just did a column-rd/wr). 184 | If the bank just did an activate or precharge, it is not a candidate for closure. 
*/ 185 | if (rd_ptr->next_command == COL_READ_CMD) 186 | { 187 | if (rd_ptr->thread_id != get_core_highest_priority(channel)) 188 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 1; 189 | else 190 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 0; 191 | hits[channel][rd_ptr->thread_id]++; 192 | } 193 | if (rd_ptr->next_command == ACT_CMD) 194 | { 195 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 0; 196 | } 197 | if (rd_ptr->next_command == PRE_CMD) 198 | { 199 | recent_colacc[channel][rd_ptr->dram_addr.rank][rd_ptr->dram_addr.bank] = 0; 200 | } 201 | issue_request_command (rd_ptr); 202 | accesses[channel][rd_ptr->thread_id]++; 203 | break; 204 | } 205 | } 206 | } 207 | 208 | /* If a command hasn't yet been issued to this channel in this cycle, issue a precharge. */ 209 | if (!command_issued_current_cycle[channel]) 210 | { 211 | for (i = 0; i < NUM_RANKS; i++) 212 | { 213 | for (j = 0; j < NUM_BANKS; j++) 214 | { /* For all banks on the channel.. */ 215 | if (recent_colacc[channel][i][j]) 216 | { /* See if this bank is a candidate. */ 217 | if (is_precharge_allowed (channel, i, j)) 218 | { /* See if precharge is doable. */ 219 | if (issue_precharge_command (channel, i, j)) 220 | { 221 | num_aggr_precharge++; 222 | recent_colacc[channel][i][j] = 0; 223 | } 224 | } 225 | } 226 | } 227 | } 228 | } 229 | 230 | long long total_accesses = 0; 231 | long long total_hits = 0; 232 | // update priorities 233 | for (int core = 0; core < MAX_THREADS; core++) 234 | { 235 | total_accesses += accesses[channel][core]; 236 | total_hits += hits[channel][core]; 237 | } 238 | 239 | for (int core = 0; core < MAX_THREADS; core++) 240 | { 241 | if (total_hits && total_accesses) // cast so the shares are fractions, not truncated integers 242 | priority[channel][core] = (double) hits[channel][core] / total_hits + (double) accesses[channel][core] / total_accesses; 243 | } 244 | 245 | } 246 | 247 | void 248 | scheduler_stats () 249 | { 250 | /* Report the number of aggressive precharges.
*/ 251 | printf ("Number of aggressive precharges: %lld\n", num_aggr_precharge); 252 | } 253 | -------------------------------------------------------------------------------- /src/scheduler.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include "utlist.h" 3 | #include "utils.h" 4 | 5 | #include "memory_controller.h" 6 | 7 | extern long long int CYCLE_VAL; 8 | 9 | void 10 | init_scheduler_vars () 11 | { 12 | // initialize all scheduler variables here 13 | 14 | return; 15 | } 16 | 17 | // write queue high water mark; begin draining writes if write queue exceeds this value 18 | #define HI_WM 40 19 | 20 | // end write queue drain once write queue has this many writes in it 21 | #define LO_WM 20 22 | 23 | // 1 means we are in write-drain mode for that channel 24 | int drain_writes[MAX_NUM_CHANNELS]; 25 | 26 | /* Each cycle it is possible to issue a valid command from the read or write queues 27 | OR 28 | a valid precharge command to any bank (issue_precharge_command()) 29 | OR 30 | a valid precharge_all bank command to a rank (issue_all_bank_precharge_command()) 31 | OR 32 | a power_down command (issue_powerdown_command()), programmed either for fast or slow exit mode 33 | OR 34 | a refresh command (issue_refresh_command()) 35 | OR 36 | a power_up command (issue_powerup_command()) 37 | OR 38 | an activate to a specific row (issue_activate_command()). 39 | 40 | If a COL-RD or COL-WR is picked for issue, the scheduler also has the 41 | option to issue an auto-precharge in this cycle (issue_autoprecharge()). 42 | 43 | Before issuing a command it is important to check if it is issuable. For the RD/WR queue resident commands, checking the "command_issuable" flag is necessary. 
To check if the other commands (mentioned above) can be issued, it is important to check one of the following functions: is_precharge_allowed, is_all_bank_precharge_allowed, is_powerdown_fast_allowed, is_powerdown_slow_allowed, is_powerup_allowed, is_refresh_allowed, is_autoprecharge_allowed, is_activate_allowed. 44 | */ 45 | 46 | void 47 | schedule (int channel) 48 | { 49 | request_t *rd_ptr = NULL; 50 | request_t *wr_ptr = NULL; 51 | 52 | 53 | // if in write drain mode, keep draining writes until the 54 | // write queue occupancy drops to LO_WM 55 | if (drain_writes[channel] && (write_queue_length[channel] > LO_WM)) 56 | { 57 | drain_writes[channel] = 1; // Keep draining. 58 | } 59 | else 60 | { 61 | drain_writes[channel] = 0; // No need to drain. 62 | } 63 | 64 | // initiate write drain if either the write queue occupancy 65 | // has reached the HI_WM , OR, if there are no pending read 66 | // requests 67 | if (write_queue_length[channel] > HI_WM) 68 | { 69 | drain_writes[channel] = 1; 70 | } 71 | else 72 | { 73 | if (!read_queue_length[channel]) 74 | drain_writes[channel] = 1; 75 | } 76 | 77 | 78 | // If in write drain mode, look through all the write queue 79 | // elements (already arranged in the order of arrival), and 80 | // issue the command for the first request that is ready 81 | if (drain_writes[channel]) 82 | { 83 | 84 | LL_FOREACH (write_queue_head[channel], wr_ptr) 85 | { 86 | if (wr_ptr->command_issuable) 87 | { 88 | issue_request_command (wr_ptr); 89 | break; 90 | } 91 | } 92 | return; 93 | } 94 | 95 | // Draining Reads 96 | // look through the queue and find the first request whose 97 | // command can be issued in this cycle and issue it 98 | // Simple FCFS 99 | if (!drain_writes[channel]) 100 | { 101 | LL_FOREACH (read_queue_head[channel], rd_ptr) 102 | { 103 | if (rd_ptr->command_issuable) 104 | { 105 | issue_request_command (rd_ptr); 106 | break; 107 | } 108 | } 109 | return; 110 | } 111 | } 112 | 113 | void 114 | scheduler_stats () 115 | { 116 
| /* Nothing to print for now. */ 117 | } 118 | -------------------------------------------------------------------------------- /src/scheduler.h: -------------------------------------------------------------------------------- 1 | #ifndef __SCHEDULER_H__ 2 | #define __SCHEDULER_H__ 3 | 4 | void init_scheduler_vars(); //called from main 5 | void scheduler_stats(); //called from main 6 | void schedule(int); // scheduler function called every cycle 7 | 8 | #endif //__SCHEDULER_H__ 9 | 10 | -------------------------------------------------------------------------------- /src/utils.h: -------------------------------------------------------------------------------- 1 | #ifndef __UTILS_H__ 2 | #define __UTILS_H__ 3 | 4 | // Utility functions 5 | 6 | // Turn on the following flag to see debug messages 7 | //#define CMD_DEBUG 8 | //#define SCHEDULER_DEBUG 9 | 10 | #ifdef CMD_DEBUG 11 | #define UT_MEM_DEBUG(...) printf(__VA_ARGS__) 12 | #else 13 | #define UT_MEM_DEBUG(...) 14 | #endif 15 | 16 | 17 | #define SCHEDULER_DEBUG 18 | #ifdef SCHEDULER_DEBUG 19 | #define SCHEDULER_DEBUG_MSG(...) printf(__VA_ARGS__) 20 | #else 21 | #define SCHEDULER_DEBUG_MSG(...) 22 | #endif 23 | 24 | #endif // __UTILS_H__ 25 | 26 | -------------------------------------------------------------------------------- /src/utlist.h: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright (c) 2007-2011, Troy D. Hanson http://uthash.sourceforge.net 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions are met: 7 | 8 | * Redistributions of source code must retain the above copyright 9 | notice, this list of conditions and the following disclaimer. 
10 | 11 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS 12 | IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED 13 | TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A 14 | PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER 15 | OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 16 | EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 17 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR 18 | PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 19 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 20 | NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 21 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 22 | */ 23 | 24 | #ifndef UTLIST_H 25 | #define UTLIST_H 26 | 27 | #define UTLIST_VERSION 1.9.4 28 | 29 | #include <assert.h> 30 | 31 | /* 32 | * This file contains macros to manipulate singly and doubly-linked lists. 33 | * 34 | * 1. LL_ macros: singly-linked lists. 35 | * 2. DL_ macros: doubly-linked lists. 36 | * 3. CDL_ macros: circular doubly-linked lists. 37 | * 38 | * To use singly-linked lists, your structure must have a "next" pointer. 39 | * To use doubly-linked lists, your structure must have "prev" and "next" pointers. 40 | * Either way, the pointer to the head of the list must be initialized to NULL. 41 | * 42 | * ----------------.EXAMPLE ------------------------- 43 | * struct item { 44 | * int id; 45 | * struct item *prev, *next; 46 | * }; 47 | * 48 | * struct item *list = NULL; 49 | * 50 | * int main() { 51 | * struct item *item; 52 | * ... allocate and populate item ... 
53 | * DL_APPEND(list, item); 54 | * } 55 | * -------------------------------------------------- 56 | * 57 | * For doubly-linked lists, the append and delete macros are O(1) 58 | * For singly-linked lists, append and delete are O(n) but prepend is O(1) 59 | * The sort macro is O(n log(n)) for all types of single/double/circular lists. 60 | */ 61 | 62 | /* These macros use decltype or the earlier __typeof GNU extension. 63 | As decltype is only available in newer compilers (VS2010 or gcc 4.3+ 64 | when compiling c++ code), this code uses whatever method is needed 65 | or, for VS2008 where neither is available, uses casting workarounds. */ 66 | #ifdef _MSC_VER /* MS compiler */ 67 | #if _MSC_VER >= 1600 && defined(__cplusplus) /* VS2010 or newer in C++ mode */ 68 | #define LDECLTYPE(x) decltype(x) 69 | #else /* VS2008 or older (or VS2010 in C mode) */ 70 | #define NO_DECLTYPE 71 | #define LDECLTYPE(x) char* 72 | #endif 73 | #else /* GNU, Sun and other compilers */ 74 | #define LDECLTYPE(x) __typeof(x) 75 | #endif 76 | 77 | /* for VS2008 we use some workarounds to get around the lack of decltype, 78 | * namely, we always reassign our tmp variable to the list head if we need 79 | * to dereference its prev/next pointers, and save/restore the real head.*/ 80 | #ifdef NO_DECLTYPE 81 | #define _SV(elt,list) _tmp = (char*)(list); {char **_alias = (char**)&(list); *_alias = (elt); } 82 | #define _NEXT(elt,list) ((char*)((list)->next)) 83 | #define _NEXTASGN(elt,list,to) { char **_alias = (char**)&((list)->next); *_alias=(char*)(to); } 84 | #define _PREV(elt,list) ((char*)((list)->prev)) 85 | #define _PREVASGN(elt,list,to) { char **_alias = (char**)&((list)->prev); *_alias=(char*)(to); } 86 | #define _RS(list) { char **_alias = (char**)&(list); *_alias=_tmp; } 87 | #define _CASTASGN(a,b) { char **_alias = (char**)&(a); *_alias=(char*)(b); } 88 | #else 89 | #define _SV(elt,list) 90 | #define _NEXT(elt,list) ((elt)->next) 91 | #define _NEXTASGN(elt,list,to) ((elt)->next)=(to) 
92 | #define _PREV(elt,list) ((elt)->prev) 93 | #define _PREVASGN(elt,list,to) ((elt)->prev)=(to) 94 | #define _RS(list) 95 | #define _CASTASGN(a,b) (a)=(b) 96 | #endif 97 | 98 | /****************************************************************************** 99 | * The sort macro is an adaptation of Simon Tatham's O(n log(n)) mergesort * 100 | * Unwieldy variable names used here to avoid shadowing passed-in variables. * 101 | *****************************************************************************/ 102 | #define LL_SORT(list, cmp) \ 103 | do { \ 104 | LDECLTYPE(list) _ls_p; \ 105 | LDECLTYPE(list) _ls_q; \ 106 | LDECLTYPE(list) _ls_e; \ 107 | LDECLTYPE(list) _ls_tail; \ 108 | LDECLTYPE(list) _ls_oldhead; \ 109 | LDECLTYPE(list) _tmp; \ 110 | int _ls_insize, _ls_nmerges, _ls_psize, _ls_qsize, _ls_i, _ls_looping; \ 111 | if (list) { \ 112 | _ls_insize = 1; \ 113 | _ls_looping = 1; \ 114 | while (_ls_looping) { \ 115 | _CASTASGN(_ls_p,list); \ 116 | _CASTASGN(_ls_oldhead,list); \ 117 | list = NULL; \ 118 | _ls_tail = NULL; \ 119 | _ls_nmerges = 0; \ 120 | while (_ls_p) { \ 121 | _ls_nmerges++; \ 122 | _ls_q = _ls_p; \ 123 | _ls_psize = 0; \ 124 | for (_ls_i = 0; _ls_i < _ls_insize; _ls_i++) { \ 125 | _ls_psize++; \ 126 | _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); \ 127 | if (!_ls_q) break; \ 128 | } \ 129 | _ls_qsize = _ls_insize; \ 130 | while (_ls_psize > 0 || (_ls_qsize > 0 && _ls_q)) { \ 131 | if (_ls_psize == 0) { \ 132 | _ls_e = _ls_q; _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); _ls_qsize--; \ 133 | } else if (_ls_qsize == 0 || !_ls_q) { \ 134 | _ls_e = _ls_p; _SV(_ls_p,list); _ls_p = _NEXT(_ls_p,list); _RS(list); _ls_psize--; \ 135 | } else if (cmp(_ls_p,_ls_q) <= 0) { \ 136 | _ls_e = _ls_p; _SV(_ls_p,list); _ls_p = _NEXT(_ls_p,list); _RS(list); _ls_psize--; \ 137 | } else { \ 138 | _ls_e = _ls_q; _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); _ls_qsize--; \ 139 | } \ 140 | if (_ls_tail) { \ 141 | _SV(_ls_tail,list); 
_NEXTASGN(_ls_tail,list,_ls_e); _RS(list); \ 142 | } else { \ 143 | _CASTASGN(list,_ls_e); \ 144 | } \ 145 | _ls_tail = _ls_e; \ 146 | } \ 147 | _ls_p = _ls_q; \ 148 | } \ 149 | _SV(_ls_tail,list); _NEXTASGN(_ls_tail,list,NULL); _RS(list); \ 150 | if (_ls_nmerges <= 1) { \ 151 | _ls_looping=0; \ 152 | } \ 153 | _ls_insize *= 2; \ 154 | } \ 155 | } else _tmp=NULL; /* quiet gcc unused variable warning */ \ 156 | } while (0) 157 | 158 | #define DL_SORT(list, cmp) \ 159 | do { \ 160 | LDECLTYPE(list) _ls_p; \ 161 | LDECLTYPE(list) _ls_q; \ 162 | LDECLTYPE(list) _ls_e; \ 163 | LDECLTYPE(list) _ls_tail; \ 164 | LDECLTYPE(list) _ls_oldhead; \ 165 | LDECLTYPE(list) _tmp; \ 166 | int _ls_insize, _ls_nmerges, _ls_psize, _ls_qsize, _ls_i, _ls_looping; \ 167 | if (list) { \ 168 | _ls_insize = 1; \ 169 | _ls_looping = 1; \ 170 | while (_ls_looping) { \ 171 | _CASTASGN(_ls_p,list); \ 172 | _CASTASGN(_ls_oldhead,list); \ 173 | list = NULL; \ 174 | _ls_tail = NULL; \ 175 | _ls_nmerges = 0; \ 176 | while (_ls_p) { \ 177 | _ls_nmerges++; \ 178 | _ls_q = _ls_p; \ 179 | _ls_psize = 0; \ 180 | for (_ls_i = 0; _ls_i < _ls_insize; _ls_i++) { \ 181 | _ls_psize++; \ 182 | _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); \ 183 | if (!_ls_q) break; \ 184 | } \ 185 | _ls_qsize = _ls_insize; \ 186 | while (_ls_psize > 0 || (_ls_qsize > 0 && _ls_q)) { \ 187 | if (_ls_psize == 0) { \ 188 | _ls_e = _ls_q; _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); _ls_qsize--; \ 189 | } else if (_ls_qsize == 0 || !_ls_q) { \ 190 | _ls_e = _ls_p; _SV(_ls_p,list); _ls_p = _NEXT(_ls_p,list); _RS(list); _ls_psize--; \ 191 | } else if (cmp(_ls_p,_ls_q) <= 0) { \ 192 | _ls_e = _ls_p; _SV(_ls_p,list); _ls_p = _NEXT(_ls_p,list); _RS(list); _ls_psize--; \ 193 | } else { \ 194 | _ls_e = _ls_q; _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); _ls_qsize--; \ 195 | } \ 196 | if (_ls_tail) { \ 197 | _SV(_ls_tail,list); _NEXTASGN(_ls_tail,list,_ls_e); _RS(list); \ 198 | } else { \ 199 | 
_CASTASGN(list,_ls_e); \ 200 | } \ 201 | _SV(_ls_e,list); _PREVASGN(_ls_e,list,_ls_tail); _RS(list); \ 202 | _ls_tail = _ls_e; \ 203 | } \ 204 | _ls_p = _ls_q; \ 205 | } \ 206 | _CASTASGN(list->prev, _ls_tail); \ 207 | _SV(_ls_tail,list); _NEXTASGN(_ls_tail,list,NULL); _RS(list); \ 208 | if (_ls_nmerges <= 1) { \ 209 | _ls_looping=0; \ 210 | } \ 211 | _ls_insize *= 2; \ 212 | } \ 213 | } else _tmp=NULL; /* quiet gcc unused variable warning */ \ 214 | } while (0) 215 | 216 | #define CDL_SORT(list, cmp) \ 217 | do { \ 218 | LDECLTYPE(list) _ls_p; \ 219 | LDECLTYPE(list) _ls_q; \ 220 | LDECLTYPE(list) _ls_e; \ 221 | LDECLTYPE(list) _ls_tail; \ 222 | LDECLTYPE(list) _ls_oldhead; \ 223 | LDECLTYPE(list) _tmp; \ 224 | LDECLTYPE(list) _tmp2; \ 225 | int _ls_insize, _ls_nmerges, _ls_psize, _ls_qsize, _ls_i, _ls_looping; \ 226 | if (list) { \ 227 | _ls_insize = 1; \ 228 | _ls_looping = 1; \ 229 | while (_ls_looping) { \ 230 | _CASTASGN(_ls_p,list); \ 231 | _CASTASGN(_ls_oldhead,list); \ 232 | list = NULL; \ 233 | _ls_tail = NULL; \ 234 | _ls_nmerges = 0; \ 235 | while (_ls_p) { \ 236 | _ls_nmerges++; \ 237 | _ls_q = _ls_p; \ 238 | _ls_psize = 0; \ 239 | for (_ls_i = 0; _ls_i < _ls_insize; _ls_i++) { \ 240 | _ls_psize++; \ 241 | _SV(_ls_q,list); \ 242 | if (_NEXT(_ls_q,list) == _ls_oldhead) { \ 243 | _ls_q = NULL; \ 244 | } else { \ 245 | _ls_q = _NEXT(_ls_q,list); \ 246 | } \ 247 | _RS(list); \ 248 | if (!_ls_q) break; \ 249 | } \ 250 | _ls_qsize = _ls_insize; \ 251 | while (_ls_psize > 0 || (_ls_qsize > 0 && _ls_q)) { \ 252 | if (_ls_psize == 0) { \ 253 | _ls_e = _ls_q; _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); _ls_qsize--; \ 254 | if (_ls_q == _ls_oldhead) { _ls_q = NULL; } \ 255 | } else if (_ls_qsize == 0 || !_ls_q) { \ 256 | _ls_e = _ls_p; _SV(_ls_p,list); _ls_p = _NEXT(_ls_p,list); _RS(list); _ls_psize--; \ 257 | if (_ls_p == _ls_oldhead) { _ls_p = NULL; } \ 258 | } else if (cmp(_ls_p,_ls_q) <= 0) { \ 259 | _ls_e = _ls_p; _SV(_ls_p,list); _ls_p = 
_NEXT(_ls_p,list); _RS(list); _ls_psize--; \ 260 | if (_ls_p == _ls_oldhead) { _ls_p = NULL; } \ 261 | } else { \ 262 | _ls_e = _ls_q; _SV(_ls_q,list); _ls_q = _NEXT(_ls_q,list); _RS(list); _ls_qsize--; \ 263 | if (_ls_q == _ls_oldhead) { _ls_q = NULL; } \ 264 | } \ 265 | if (_ls_tail) { \ 266 | _SV(_ls_tail,list); _NEXTASGN(_ls_tail,list,_ls_e); _RS(list); \ 267 | } else { \ 268 | _CASTASGN(list,_ls_e); \ 269 | } \ 270 | _SV(_ls_e,list); _PREVASGN(_ls_e,list,_ls_tail); _RS(list); \ 271 | _ls_tail = _ls_e; \ 272 | } \ 273 | _ls_p = _ls_q; \ 274 | } \ 275 | _CASTASGN(list->prev,_ls_tail); \ 276 | _CASTASGN(_tmp2,list); \ 277 | _SV(_ls_tail,list); _NEXTASGN(_ls_tail,list,_tmp2); _RS(list); \ 278 | if (_ls_nmerges <= 1) { \ 279 | _ls_looping=0; \ 280 | } \ 281 | _ls_insize *= 2; \ 282 | } \ 283 | } else _tmp=NULL; /* quiet gcc unused variable warning */ \ 284 | } while (0) 285 | 286 | /****************************************************************************** 287 | * singly linked list macros (non-circular) * 288 | *****************************************************************************/ 289 | #define LL_PREPEND(head,add) \ 290 | do { \ 291 | (add)->next = head; \ 292 | head = add; \ 293 | } while (0) 294 | 295 | #define LL_CONCAT(head1,head2) \ 296 | do { \ 297 | LDECLTYPE(head1) _tmp; \ 298 | if (head1) { \ 299 | _tmp = head1; \ 300 | while (_tmp->next) { _tmp = _tmp->next; } \ 301 | _tmp->next=(head2); \ 302 | } else { \ 303 | (head1)=(head2); \ 304 | } \ 305 | } while (0) 306 | 307 | #define LL_APPEND(head,add) \ 308 | do { \ 309 | LDECLTYPE(head) _tmp; \ 310 | (add)->next=NULL; \ 311 | if (head) { \ 312 | _tmp = head; \ 313 | while (_tmp->next) { _tmp = _tmp->next; } \ 314 | _tmp->next=(add); \ 315 | } else { \ 316 | (head)=(add); \ 317 | } \ 318 | } while (0) 319 | 320 | #define LL_DELETE(head,del) \ 321 | do { \ 322 | LDECLTYPE(head) _tmp; \ 323 | if ((head) == (del)) { \ 324 | (head)=(head)->next; \ 325 | } else { \ 326 | _tmp = head; \ 327 | while 
(_tmp->next && (_tmp->next != (del))) { \ 328 | _tmp = _tmp->next; \ 329 | } \ 330 | if (_tmp->next) { \ 331 | _tmp->next = ((del)->next); \ 332 | } \ 333 | } \ 334 | } while (0) 335 | 336 | /* Here are VS2008 replacements for LL_APPEND and LL_DELETE */ 337 | #define LL_APPEND_VS2008(head,add) \ 338 | do { \ 339 | if (head) { \ 340 | (add)->next = head; /* use add->next as a temp variable */ \ 341 | while ((add)->next->next) { (add)->next = (add)->next->next; } \ 342 | (add)->next->next=(add); \ 343 | } else { \ 344 | (head)=(add); \ 345 | } \ 346 | (add)->next=NULL; \ 347 | } while (0) 348 | 349 | #define LL_DELETE_VS2008(head,del) \ 350 | do { \ 351 | if ((head) == (del)) { \ 352 | (head)=(head)->next; \ 353 | } else { \ 354 | char *_tmp = (char*)(head); \ 355 | while (head->next && (head->next != (del))) { \ 356 | head = head->next; \ 357 | } \ 358 | if (head->next) { \ 359 | head->next = ((del)->next); \ 360 | } \ 361 | { \ 362 | char **_head_alias = (char**)&(head); \ 363 | *_head_alias = _tmp; \ 364 | } \ 365 | } \ 366 | } while (0) 367 | #ifdef NO_DECLTYPE 368 | #undef LL_APPEND 369 | #define LL_APPEND LL_APPEND_VS2008 370 | #undef LL_DELETE 371 | #define LL_DELETE LL_DELETE_VS2008 372 | #undef LL_CONCAT /* no LL_CONCAT_VS2008 */ 373 | #undef DL_CONCAT /* no DL_CONCAT_VS2008 */ 374 | #endif 375 | /* end VS2008 replacements */ 376 | 377 | #define LL_FOREACH(head,el) \ 378 | for(el=head;el;el=el->next) 379 | 380 | #define LL_FOREACH_SAFE(head,el,tmp) \ 381 | for((el)=(head);(el) && (tmp = (el)->next, 1); (el) = tmp) 382 | 383 | #define LL_SEARCH_SCALAR(head,out,field,val) \ 384 | do { \ 385 | LL_FOREACH(head,out) { \ 386 | if ((out)->field == (val)) break; \ 387 | } \ 388 | } while(0) 389 | 390 | #define LL_SEARCH(head,out,elt,cmp) \ 391 | do { \ 392 | LL_FOREACH(head,out) { \ 393 | if ((cmp(out,elt))==0) break; \ 394 | } \ 395 | } while(0) 396 | 397 | /****************************************************************************** 398 | * doubly linked list 
macros (non-circular) * 399 | *****************************************************************************/ 400 | #define DL_PREPEND(head,add) \ 401 | do { \ 402 | (add)->next = head; \ 403 | if (head) { \ 404 | (add)->prev = (head)->prev; \ 405 | (head)->prev = (add); \ 406 | } else { \ 407 | (add)->prev = (add); \ 408 | } \ 409 | (head) = (add); \ 410 | } while (0) 411 | 412 | #define DL_APPEND(head,add) \ 413 | do { \ 414 | if (head) { \ 415 | (add)->prev = (head)->prev; \ 416 | (head)->prev->next = (add); \ 417 | (head)->prev = (add); \ 418 | (add)->next = NULL; \ 419 | } else { \ 420 | (head)=(add); \ 421 | (head)->prev = (head); \ 422 | (head)->next = NULL; \ 423 | } \ 424 | } while (0) 425 | 426 | #define DL_CONCAT(head1,head2) \ 427 | do { \ 428 | LDECLTYPE(head1) _tmp; \ 429 | if (head2) { \ 430 | if (head1) { \ 431 | _tmp = (head2)->prev; \ 432 | (head2)->prev = (head1)->prev; \ 433 | (head1)->prev->next = (head2); \ 434 | (head1)->prev = _tmp; \ 435 | } else { \ 436 | (head1)=(head2); \ 437 | } \ 438 | } \ 439 | } while (0) 440 | 441 | #define DL_DELETE(head,del) \ 442 | do { \ 443 | assert((del)->prev != NULL); \ 444 | if ((del)->prev == (del)) { \ 445 | (head)=NULL; \ 446 | } else if ((del)==(head)) { \ 447 | (del)->next->prev = (del)->prev; \ 448 | (head) = (del)->next; \ 449 | } else { \ 450 | (del)->prev->next = (del)->next; \ 451 | if ((del)->next) { \ 452 | (del)->next->prev = (del)->prev; \ 453 | } else { \ 454 | (head)->prev = (del)->prev; \ 455 | } \ 456 | } \ 457 | } while (0) 458 | 459 | 460 | #define DL_FOREACH(head,el) \ 461 | for(el=head;el;el=el->next) 462 | 463 | /* this version is safe for deleting the elements during iteration */ 464 | #define DL_FOREACH_SAFE(head,el,tmp) \ 465 | for((el)=(head);(el) && (tmp = (el)->next, 1); (el) = tmp) 466 | 467 | /* these are identical to their singly-linked list counterparts */ 468 | #define DL_SEARCH_SCALAR LL_SEARCH_SCALAR 469 | #define DL_SEARCH LL_SEARCH 470 | 471 | 
/****************************************************************************** 472 | * circular doubly linked list macros * 473 | *****************************************************************************/ 474 | #define CDL_PREPEND(head,add) \ 475 | do { \ 476 | if (head) { \ 477 | (add)->prev = (head)->prev; \ 478 | (add)->next = (head); \ 479 | (head)->prev = (add); \ 480 | (add)->prev->next = (add); \ 481 | } else { \ 482 | (add)->prev = (add); \ 483 | (add)->next = (add); \ 484 | } \ 485 | (head)=(add); \ 486 | } while (0) 487 | 488 | #define CDL_DELETE(head,del) \ 489 | do { \ 490 | if ( ((head)==(del)) && ((head)->next == (head))) { \ 491 | (head) = 0L; \ 492 | } else { \ 493 | (del)->next->prev = (del)->prev; \ 494 | (del)->prev->next = (del)->next; \ 495 | if ((del) == (head)) (head)=(del)->next; \ 496 | } \ 497 | } while (0) 498 | 499 | #define CDL_FOREACH(head,el) \ 500 | for(el=head;el;el=(el->next==head ? 0L : el->next)) 501 | 502 | #define CDL_FOREACH_SAFE(head,el,tmp1,tmp2) \ 503 | for((el)=(head), ((tmp1)=(head)?((head)->prev):NULL); \ 504 | (el) && ((tmp2)=(el)->next, 1); \ 505 | ((el) = (((el)==(tmp1)) ? 
0L : (tmp2)))) 506 | 507 | #define CDL_SEARCH_SCALAR(head,out,field,val) \ 508 | do { \ 509 | CDL_FOREACH(head,out) { \ 510 | if ((out)->field == (val)) break; \ 511 | } \ 512 | } while(0) 513 | 514 | #define CDL_SEARCH(head,out,elt,cmp) \ 515 | do { \ 516 | CDL_FOREACH(head,out) { \ 517 | if ((cmp(out,elt))==0) break; \ 518 | } \ 519 | } while(0) 520 | 521 | #endif /* UTLIST_H */ 522 | 523 | -------------------------------------------------------------------------------- /usimm-script.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl -w 2 | 3 | #----------------------------------------------------------- 4 | # Description: 5 | # ------------ 6 | # This script parses the files mentioned in the runsim script 7 | # and creates a csv with all the relevant information of all 8 | # the simulations 9 | # 10 | # Inputs: 11 | # ------- 12 | # 1) usimm-script.pl takes the name of the USIMM runscript as the only argument 13 | # 2) The outputs produced by the runscript should be in the output/ 14 | # directory 15 | # 3) The output files must follow the following naming convention 16 | # -...-<>-_ 17 | # 4) Usage: > cd 18 | # > ./usimm-script.pl -runscript 19 | # 20 | # Outputs: 21 | # -------- 22 | # output/stats.csv 23 | # Final Metric numbers are also printed on Stdout 24 | # 25 | # Other Notes: 26 | # ------------ 27 | # 1) Make sure all your single thread bcmks have "$single_thread_time" 28 | # (The times in this usimm-script.pl script represent single thread behavior 29 | # with an FCFS scheduler, and will be the ones used for the MSC) 30 | # 2) $mt includes the programs excluded from the fairness calculations 31 | # 32 | # Written by Manjunath Shevgoor, shevgoor@cs.utah.edu 33 | #----------------------------------------------------------- 34 | use strict; 35 | 36 | my %single_thread_time; 37 | 38 | # Benchmark Naming convention 39 | # $single_thread_time{} 40 | 41 | $single_thread_time{bl1}=318150748; 42 | 
$single_thread_time{bo1}=293623201; 43 | $single_thread_time{ca1}=465074385; 44 | $single_thread_time{fa1}=404645160; 45 | $single_thread_time{fe1}=379065129; 46 | $single_thread_time{fr1}=305902869; 47 | $single_thread_time{ra1}=319983309; 48 | $single_thread_time{st1}=320441340; 49 | $single_thread_time{vi1}=325420205; 50 | $single_thread_time{x21}=332000385; 51 | $single_thread_time{c11}=372897100; 52 | $single_thread_time{c21}=442948245; 53 | $single_thread_time{fl1}=468052997; 54 | $single_thread_time{sw1}=474243253; 55 | 56 | 57 | $single_thread_time{bl4}=187992840; 58 | $single_thread_time{bo4}=167672553; 59 | $single_thread_time{ca4}=300787617; 60 | $single_thread_time{fa4}=210337888; 61 | $single_thread_time{fe4}=232711401; 62 | $single_thread_time{fr4}=174316754; 63 | $single_thread_time{ra4}=186816805; 64 | $single_thread_time{st4}=188074168; 65 | $single_thread_time{vi4}=192811301; 66 | $single_thread_time{x24}=197657337; 67 | $single_thread_time{c14}=244419708; 68 | $single_thread_time{c24}=303069945; 69 | $single_thread_time{fl4}=275488665; 70 | $single_thread_time{sw4}=276348929; 71 | 72 | # Multi threaded benchmarks and single programmed workloads are being excluded 73 | # from the fairness calculations. 74 | # Slowdown cannot be calculated for these 75 | my %mt; 76 | $mt{MTc} =1; 77 | $mt{c2} =1; 78 | 79 | #----------------------------------------------------------- 80 | #Get Options 81 | use Getopt::Long; 82 | my ($ret, $help); 83 | my $runscript; 84 | 85 | $ret = Getopt::Long::GetOptions ( 86 | 87 | "runscript|runsim:s" => \$runscript, 88 | "help|h:s" => \$help 89 | ) ; 90 | 91 | 92 | 93 | if( !(defined $runscript)) { 94 | print STDERR "Warning: USIMM runscript not specified. 
Using the default ./runsim\n"; 95 | $runscript= "runsim"; 96 | print STDERR "Usage: $0 -runscript \n\n"; 97 | } 98 | 99 | if (defined $help) { 100 | print STDERR "Usage: $0 -runscript \n\n"; 101 | exit; 102 | } 103 | #----------------------------------------------------------- 104 | # Check if the script is being run in the correct directory 105 | if (! -d "./output") { 106 | die ("ERROR: ./output does not exist. Exiting"); 107 | } 108 | 109 | # Check if the runscript exists in the location specified by the user 110 | if (! -e "$runscript") { 111 | die ("ERROR: $runscript does not exist. Exiting"); 112 | } 113 | 114 | #----------------------------------------------------------- 115 | # Parse the runscript to get the names of all the output files 116 | # The runscript is assumed to be a shell script where the output 117 | # is dumped using the ">" operator 118 | # All files generated by the ">" operator in the runscript will be 119 | # assumed to be valid usimm outputs 120 | 121 | open (RUNSIM_FH, "<$runscript") or die "Error: Can't open $runscript\n"; 122 | 123 | # This variable keeps track of the max number of programs in each workload 124 | # This is needed to align the columns in the .csv 125 | my $max_progs=0; 126 | 127 | my $line; 128 | 129 | #This variable will have all the filenames that will be parsed as valid USIMM 130 | #outputs 131 | my @outputs; 132 | 133 | while (<RUNSIM_FH>) { 134 | chomp; 135 | $line=$_; 136 | if ($line=~/>/) { 137 | chomp; 138 | $line=~s/.*>\s*output\///; 139 | $line=~s/\s*&//; 140 | push @outputs, $line; 141 | 142 | my @benchmarks= split (/-/,$line); 143 | my $file_len = scalar @benchmarks -1; 144 | $max_progs= $file_len if ($max_progs < $file_len); 145 | } 146 | } 147 | close RUNSIM_FH; 148 | 149 | #----------------------------------------------------------- 150 | 151 | my $progs=0; 152 | 153 | 154 | 155 | #my @outputs = ( "c2-1", "c1-c1-1", "bl-bl-fr-fr-1", "c2-4", "st-st-st-st-1", "c1-c1-4", "fa-fa-fe-fe-1", "c1-c1-c2-c2-1", 
"bl-bl-fr-fr-4", "st-st-st-st-4", "c1-c1-c2-c2-4", "fa-fa-fe-fe-4", "MTc-1", "MTc-4", "fl-sw-c2-c2-1", "fl-sw-c2-c2-4", "fl-fl-sw-sw-c2-c2-fe-fe-4", "fl-fl-sw-sw-c2-c2-fe-fe-bl-bl-fr-fr-c1-c1-st-st-4" ); 156 | 157 | 158 | # This is the file where the output will be writted to in the output directory 159 | my $out_file = "stats.csv"; 160 | 161 | print "INFO: Writing output to output/$out_file\n"; 162 | open (OUT_FH, ">output/$out_file") or die "Error: Can't open output/$out_file\n"; 163 | 164 | print OUT_FH "Workload,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,Execution Time,,Max SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown, SlowDown...\n"; 165 | 166 | my $infile; 167 | my %bcmk; 168 | my %time; 169 | my %slowdown; 170 | my %max_slowdown; 171 | my $total_execution_time=0; 172 | my $pfp_total_execution_time=0; 173 | my %edp; 174 | my $total_edp=0; 175 | my $i; 176 | 177 | # Start reading USIMM output files 178 | foreach $infile (@outputs) { 179 | print "INFO: Opening file output/$infile\n"; 180 | open (IN_FH, ") { 193 | chomp; 194 | $line= $_; 195 | 196 | 197 | 198 | if ($line =~/^Done: Core/) { 199 | #time {inFile} {core} 200 | $time{$infile}{$n} = $line; 201 | $time{$infile}{$n} =~s/.* //; 202 | 203 | 204 | $total_execution_time += $time{$infile}{$n}; 205 | $pfp_total_execution_time += $time{$infile}{$n} if (!exists $mt{$file_name}); 206 | 207 | $bcmk{$infile}{$n} = $bcmks[$n]; 208 | 209 | # If filename exists in the $mt hash, then it is multi-threaded and will not have a 210 | # single threaded exec time 211 | if (!exists $mt{$file_name}) { 212 | my $single_thread_key= "$bcmks[$n]"."$current_channel"; 213 | if (exists $single_thread_time{$single_thread_key}) { 214 | 
$slowdown{$infile}{$n} = $time{$infile}{$n}/$single_thread_time{$single_thread_key}; 215 | } else { 216 | die "ERROR: single_thread_time{$single_thread_key} does not exist n=$n... Exiting"; 217 | } 218 | } else { 219 | # SLowdown is marked negative to mark invalid 220 | # This condition is checked each time $slowdown is used 221 | $slowdown{$infile}{$n} = -1; 222 | } 223 | 224 | # Find max slowdown 225 | if (! exists $max_slowdown{$infile}) { 226 | if ($slowdown{$infile}{$n} > 0) { 227 | $max_slowdown{$infile} = $slowdown{$infile}{$n}; 228 | } else { 229 | $max_slowdown{$infile} = 0; 230 | } 231 | } elsif ($max_slowdown{$infile} < $slowdown{$infile}{$n} ){ 232 | $max_slowdown{$infile} = $slowdown{$infile}{$n}; 233 | } 234 | 235 | $n++; 236 | 237 | } elsif ($line =~/Energy Delay product \(EDP\) = (\S+) J.s/) { 238 | $edp{$infile} = $1; 239 | $total_edp += $1; 240 | } 241 | } 242 | 243 | close IN_FH; 244 | } 245 | 246 | 247 | 248 | my $n=0; 249 | my $total_max_slowdown=0; 250 | my $l_slowdown=0; 251 | my $avg_max_slowdown; 252 | 253 | my $workload; 254 | my $core; 255 | my $num_non_mt_workload=0; 256 | 257 | 258 | $n=0; 259 | # The completion time of MultiThreaded workloads is not considered while 260 | # calculating the PFP metrics 261 | 262 | foreach $workload (sort keys %slowdown) { 263 | $n++; 264 | my $threads= $workload; 265 | $threads=~s/-\d+$//; 266 | if ( ! 
exists $mt{$threads}) { 267 | $num_non_mt_workload++ ; 268 | } 269 | $total_max_slowdown+=$max_slowdown{$workload} ; 270 | } 271 | $avg_max_slowdown = $total_max_slowdown/$n; 272 | my $pfp_avg_max_slowdown = $total_max_slowdown/$num_non_mt_workload; 273 | 274 | my $pfp = $pfp_total_execution_time * $pfp_avg_max_slowdown; 275 | 276 | #Start Generating the csv file: 277 | foreach $workload (sort keys %time) { 278 | $progs=0; 279 | print OUT_FH "$workload"; 280 | foreach $core (sort {$a<=> $b} keys %{$time{$workload}}){ 281 | print OUT_FH ",$time{$workload}{$core}"; 282 | $progs++; 283 | } 284 | for ($i=$progs; $i<= $max_progs; $i++) { 285 | print OUT_FH ","; 286 | } 287 | 288 | print OUT_FH ",$max_slowdown{$workload}"; 289 | $progs=0; 290 | foreach $core (sort {$a<=> $b} keys %{$time{$workload}}){ 291 | if ($slowdown{$workload}{$core} > 0) { 292 | print OUT_FH ",$slowdown{$workload}{$core}"; 293 | } else { 294 | print OUT_FH ","; 295 | } 296 | $progs++; 297 | } 298 | for ($i=$progs; $i<= $max_progs; $i++) { 299 | print OUT_FH ","; 300 | } 301 | print OUT_FH "\n"; 302 | 303 | } 304 | 305 | print OUT_FH "\n\nTotal Exection Time, $total_execution_time\n"; 306 | print OUT_FH "PFP Avg Max Slowdown, $pfp_avg_max_slowdown\n"; 307 | print OUT_FH "PFP, $pfp\n\n\n"; 308 | 309 | #Print the EDP stats into the CSV 310 | 311 | print OUT_FH "Work Load, EDP\n"; 312 | foreach $workload (sort keys %edp) { 313 | print OUT_FH "$workload, $edp{$workload}\n"; 314 | } 315 | print OUT_FH "Total, $total_edp\n"; 316 | 317 | print "#----------------------------------------------\n"; 318 | print "Total_execution_time = $total_execution_time\n"; 319 | print "PFP Total_execution_time = $pfp_total_execution_time\n"; 320 | print "PFP = $pfp\n"; 321 | print "PFP Average Max Slowdown = $pfp_avg_max_slowdown\n"; 322 | print "Total EDP = $total_edp\n"; 323 | print "#----------------------------------------------\n"; 324 | 325 | 326 | close OUT_FH; 327 | 
-------------------------------------------------------------------------------- /usimm.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pranith/usimm/42256921172845fcda70ded3d0a27c106c92ad25/usimm.pdf --------------------------------------------------------------------------------