├── .gitignore ├── LICENSE.TXT ├── Makefile ├── README.md ├── example ├── gps.csv ├── gps_no_header.csv ├── gps_output.csv ├── gps_semicolon.csv ├── gps_timestamp_format.csv ├── traj.csv ├── traj_format.csv ├── traj_input.csv └── traj_timestamp.csv ├── gps2traj.cpp └── traj2gps.cpp /.gitignore: -------------------------------------------------------------------------------- 1 | # binary and dist folder 2 | bin/ 3 | dist/ 4 | 5 | # Prerequisites 6 | *.d 7 | 8 | # Compiled Object files 9 | *.slo 10 | *.lo 11 | *.o 12 | *.obj 13 | 14 | # Precompiled Headers 15 | *.gch 16 | *.pch 17 | 18 | # Compiled Dynamic libraries 19 | *.so 20 | *.dylib 21 | *.dll 22 | 23 | # Fortran module files 24 | *.mod 25 | *.smod 26 | 27 | # Compiled Static libraries 28 | *.lai 29 | *.la 30 | *.a 31 | *.lib 32 | 33 | # Executables 34 | *.exe 35 | *.out 36 | *.app 37 | -------------------------------------------------------------------------------- /LICENSE.TXT: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Can Yang 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | build:init 2 | g++ -O3 -std=c++11 gps2traj.cpp -o bin/gps2traj 3 | g++ -O3 -std=c++11 traj2gps.cpp -o bin/traj2gps 4 | init: 5 | mkdir -p bin 6 | install: 7 | cp bin/gps2traj /usr/local/bin 8 | cp bin/traj2gps /usr/local/bin 9 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ### gps2traj 2 | 3 | `gps2traj` is a command-line tool that converts GPS data (each row is a point with id, x, y, and timestamp fields) to trajectory format (each row is one trip stored as a LineString). Both input and output are CSV files. 4 | 5 | The tool partitions GPS points by the id field and sorts the points of each trajectory by timestamp. 6 | 7 | ``` 8 | # Input 9 | id,x,y,timestamp 10 | 1,0,0,3 11 | 1,0,1,2 12 | 1,1,1,1 13 | 14 | # Output 15 | index;id;geom 16 | 1;1;LineString(1 1,0 1,0 0) 17 | ``` 18 | 19 | A test on a CSV file with 30 million rows took about 84.797 seconds. 
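The partition-and-sort step described above can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the code in `gps2traj.cpp`: the names `GpsPoint` and `points_to_linestrings` are hypothetical, ids are kept in a `std::map` (so rows come out ordered by id string), and the `index` column, gap splitting, and CSV parsing are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for this sketch: one GPS observation.
struct GpsPoint {
  std::string id;
  double x;
  double y;
  double timestamp;
};

// Group points by id, sort each group by timestamp, and format each group
// as "id;LineString(x y,x y,...)". Using std::map to order the ids is an
// assumption of this sketch, not necessarily what gps2traj does.
std::vector<std::string> points_to_linestrings(
    const std::vector<GpsPoint> &points) {
  std::map<std::string, std::vector<GpsPoint>> groups;
  for (const GpsPoint &p : points) {
    groups[p.id].push_back(p);
  }
  std::vector<std::string> rows;
  for (auto &entry : groups) {
    std::vector<GpsPoint> &trip = entry.second;
    // stable_sort keeps the input order of points with equal timestamps.
    std::stable_sort(trip.begin(), trip.end(),
                     [](const GpsPoint &a, const GpsPoint &b) {
                       return a.timestamp < b.timestamp;
                     });
    std::ostringstream oss;
    oss << entry.first << ";LineString(";
    for (std::size_t i = 0; i < trip.size(); ++i) {
      if (i > 0) oss << ",";
      oss << trip[i].x << " " << trip[i].y;
    }
    oss << ")";
    rows.push_back(oss.str());
  }
  return rows;
}
```

Called with the three input rows from the example above, the function returns a single row, `1;LineString(1 1,0 1,0 0)`.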
20 | 21 | #### Usage of gps2traj 22 | 23 | - `-i/--input`: input file 24 | - `-o/--output`: output file 25 | - `-d/--delim`: delimiter character (default `,`) 26 | - `--id`: id column name or index (default `id`) 27 | - `-x/--x`: x column name or index (default `x`) 28 | - `-y/--y`: y column name or index (default `y`) 29 | - `-t/--time`: timestamp column name or index (default `timestamp`) 30 | - `-f/--tf`: timestamp format (default: Unix timestamp; otherwise a strftime-style template, e.g. `%Y-%m-%d %H:%M:%S`) 31 | - `--no_header`: specify if the GPS file has no header row (columns are then given by index) 32 | - `--time_gap`: time gap used to split long trajectories (default 1e9) 33 | - `--dist_gap`: distance gap used to split long trajectories (default 1e9) 34 | - `--ofields`: extra output fields, any of `ts`, `tend`, `timestamp`, separated by `,` (default: none) 35 | 36 | See https://en.cppreference.com/w/cpp/chrono/c/strftime for the format specifiers. 37 | 38 | #### Run example 39 | 40 | ```bash 41 | cd example 42 | gps2traj -i gps.csv -o traj.csv --time_gap 10 43 | gps2traj -i gps_semicolon.csv -o traj.csv --time_gap 10 -d ';' 44 | gps2traj -i gps_no_header.csv -o traj_timestamp.csv --time_gap 10 --no_header --id 0 -x 2 -y 3 -t 1 --ofields ts,tend,timestamp 45 | gps2traj -i gps_timestamp_format.csv -x lng -y lat -f "%Y-%m-%d %H:%M:%S" --ofields timestamp -o traj_format.csv 46 | ``` 47 | 48 | ### traj2gps 49 | 50 | This tool transforms a trajectory with LineString geometry into points. 
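A minimal sketch of this conversion, assuming well-formed input of the form `LineString(x y,x y,...)`. It is not the parser in `traj2gps.cpp`; the names `XY` and `linestring_to_points` are hypothetical, and writing the output rows is left to the caller.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical point type for this sketch.
struct XY {
  double x;
  double y;
};

// Parse "LineString(x y,x y,...)" into a vector of points.
// The tag is matched case-insensitively; malformed input throws.
std::vector<XY> linestring_to_points(const std::string &wkt) {
  const std::size_t open = wkt.find('(');
  const std::size_t close = wkt.rfind(')');
  if (open == std::string::npos || close == std::string::npos ||
      close < open) {
    throw std::invalid_argument("expected LineString(...)");
  }
  // Lower-case the text before '(' and drop whitespace, then compare.
  std::string tag;
  for (std::size_t i = 0; i < open; ++i) {
    const unsigned char c = static_cast<unsigned char>(wkt[i]);
    if (!std::isspace(c)) tag.push_back(static_cast<char>(std::tolower(c)));
  }
  if (tag != "linestring") {
    throw std::invalid_argument("geometry must start with LineString");
  }
  // Split the body on ',' and read one "x y" pair from each piece.
  std::vector<XY> points;
  std::stringstream body(wkt.substr(open + 1, close - open - 1));
  std::string pair;
  while (std::getline(body, pair, ',')) {
    std::stringstream fields(pair);
    XY p;
    if (!(fields >> p.x >> p.y)) {
      throw std::invalid_argument("expected an \"x y\" coordinate pair");
    }
    points.push_back(p);
  }
  return points;
}
```

For `LineString(1 1,0 1,0 0)` this returns the three points (1, 1), (0, 1), (0, 0); a point's position in the returned vector plays the role of `point_idx` in the output format.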
51 | 52 | ``` 53 | # Input 54 | id;geom 55 | 1;LineString(1 1,0 1,0 0) 56 | 57 | # Output 58 | id;point_idx;x;y 59 | 1;0;1;1 60 | 1;1;0;1 61 | 1;2;0;0 62 | ``` 63 | 64 | #### Usage of traj2gps 65 | 66 | - `-i/--input`: input file 67 | - `-o/--output`: output file 68 | - `-d/--delim`: field delimiter character (default `;`) 69 | - `--id`: id column name or index (default `id`) 70 | - `-g/--geom`: geom column name or index (default `geom`) 71 | - `--no_header`: specify if the trajectory file has no header row 72 | 73 | #### Run example 74 | 75 | ```bash 76 | cd example 77 | traj2gps -i traj_input.csv -o gps_output.csv 78 | ``` 79 | 80 | #### Build and install 81 | 82 | Run the following commands in a bash shell from the project folder: 83 | 84 | ``` 85 | make 86 | make install 87 | ``` 88 | 89 | You may need root permission for the second command; if so, run `sudo make install`. 90 | 91 | 92 | #### Dependency 93 | 94 | - Unix environment 95 | - C++11 96 | 97 | #### Contact 98 | 99 | Can Yang, Email: cyang(at)kth.se 100 | -------------------------------------------------------------------------------- /example/gps.csv: -------------------------------------------------------------------------------- 1 | id,timestamp,x,y 2 | 1,3,20,50 3 | 1,4,40,50 4 | 1,101,50,50 5 | 1,100,30,60 6 | 2,2,40,50 7 | 2,3,30,50 8 | 2,1,10,50 9 | 2,4,90,50 10 | 3,4,50,30 11 | 3,3,50,40 12 | 3,1,100,50 13 | 3,2,50,20 14 | -------------------------------------------------------------------------------- /example/gps_no_header.csv: -------------------------------------------------------------------------------- 1 | 1,3,20,50 2 | 1,4,40,50 3 | 1,101,50,50 4 | 1,100,30,60 5 | 2,2,40,50 6 | 2,3,30,50 7 | 2,1,10,50 8 | 2,4,90,50 9 | 3,4,50,30 10 | 3,3,50,40 11 | 3,1,100,50 12 | 3,2,50,20 13 | -------------------------------------------------------------------------------- /example/gps_output.csv: -------------------------------------------------------------------------------- 1 | id;point_idx;x;y 2 | 1;0;20;50 3 | 1;1;40;50 4 | 2;0;30;60 5 
| 2;1;50;50 6 | 3;0;10;50 7 | 3;1;40;50 8 | 3;2;30;50 9 | 3;3;90;50 10 | 4;0;100;50 11 | 4;1;50;20 12 | 4;2;50;40 13 | 4;3;50;30 14 | -------------------------------------------------------------------------------- /example/gps_semicolon.csv: -------------------------------------------------------------------------------- 1 | id;timestamp;x;y 2 | 1;3;20;50 3 | 1;4;40;50 4 | 1;101;50;50 5 | 1;100;30;60 6 | 2;2;40;50 7 | 2;3;30;50 8 | 2;1;10;50 9 | 2;4;90;50 10 | 3;4;50;30 11 | 3;3;50;40 12 | 3;1;100;50 13 | 3;2;50;20 14 | -------------------------------------------------------------------------------- /example/gps_timestamp_format.csv: -------------------------------------------------------------------------------- 1 | id,lat,lng,timestamp 2 | 000,39.984702,116.318417,2008-10-23 02:53:04 3 | 000,39.984683,116.31845,2008-10-23 02:53:10 4 | 000,39.984686,116.318417,2008-10-23 02:53:15 5 | 000,39.984688,116.318385,2008-10-23 02:53:20 6 | 000,39.984655,116.318263,2008-10-23 02:53:25 7 | 000,39.984611,116.318026,2008-10-23 02:53:30 8 | 000,39.984608,116.317761,2008-10-23 02:53:35 9 | 000,39.984563,116.317517,2008-10-23 02:53:40 10 | 000,39.984539,116.317294,2008-10-23 02:53:45 11 | 000,39.984606,116.317065,2008-10-23 02:53:50 12 | 181,40.9166,111.713166666667,2008-03-14 03:32:03 13 | 181,40.91595,111.713516666667,2008-03-14 03:33:07 14 | 181,40.9156,111.712933333333,2008-03-14 03:34:26 15 | 181,40.9157666666667,111.712066666667,2008-03-14 03:36:49 16 | 181,40.9154333333333,111.71145,2008-03-14 03:38:04 17 | 181,40.9148666666667,111.7105,2008-03-14 03:39:56 18 | 181,40.9142666666667,111.710333333333,2008-03-14 03:41:17 19 | 181,40.9124666666667,111.710666666667,2008-03-14 03:43:02 20 | 181,40.9115166666667,111.711316666667,2008-03-14 03:43:28 21 | 181,40.9109333333333,111.711616666667,2008-03-14 03:43:40 22 | -------------------------------------------------------------------------------- /example/traj.csv: 
-------------------------------------------------------------------------------- 1 | index;id;geom 2 | 0;1;LineString(20 50,40 50) 3 | 0;1;LineString(30 60,50 50) 4 | 1;2;LineString(10 50,40 50,30 50,90 50) 5 | 2;3;LineString(100 50,50 20,50 40,50 30) 6 | -------------------------------------------------------------------------------- /example/traj_format.csv: -------------------------------------------------------------------------------- 1 | index;id;geom;timestamp 2 | 1;000;LineString(116.318417 39.984702,116.31845 39.984683,116.318417 39.984686,116.318385 39.984688,116.318263 39.984655,116.318026 39.984611,116.317761 39.984608,116.317517 39.984563,116.317294 39.984539,116.317065 39.984606);1224701584,1224701590,1224701595,1224701600,1224701605,1224701610,1224701615,1224701620,1224701625,1224701630 3 | 2;181;LineString(111.713166667 40.9166,111.713516667 40.91595,111.712933333 40.9156,111.712066667 40.9157666667,111.71145 40.9154333333,111.7105 40.9148666667,111.710333333 40.9142666667,111.710666667 40.9124666667,111.711316667 40.9115166667,111.711616667 40.9109333333);1205436723,1205436787,1205436866,1205437009,1205437084,1205437196,1205437277,1205437382,1205437408,1205437420 4 | -------------------------------------------------------------------------------- /example/traj_input.csv: -------------------------------------------------------------------------------- 1 | id;geom 2 | 1;LineString(20 50,40 50) 3 | 2;LineString(30 60,50 50) 4 | 3;LineString(10 50,40 50,30 50,90 50) 5 | 4;LineString(100 50,50 20,50 40,50 30) 6 | -------------------------------------------------------------------------------- /example/traj_timestamp.csv: -------------------------------------------------------------------------------- 1 | index;id;geom;ts;tend;timestamps 2 | 1;1;LineString(20 50,40 50);3;4;3,4 3 | 2;1;LineString(30 60,50 50);100;101;100,101 4 | 3;2;LineString(10 50,40 50,30 50,90 50);1;4;1,2,3,4 5 | 4;3;LineString(100 50,50 20,50 40,50 30);1;4;1,2,3,4 6 | 
-------------------------------------------------------------------------------- /gps2traj.cpp: -------------------------------------------------------------------------------- 1 | // Author: Can Yang 2 | // Email : cyang@kth.se 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | #include 15 | #include 16 | #include 17 | #include 18 | 19 | // Data types 20 | 21 | struct Point { 22 | double x; 23 | double y; 24 | double timestamp; 25 | }; 26 | 27 | double string2timestamp(const std::string &intermediate, 28 | int tag){ 29 | if (tag == 0){ 30 | // double value as timestamp 31 | return std::stod(intermediate); 32 | } else if (tag==1) { 33 | // 2020-01-01T00:00:27 34 | std::tm tm = {}; 35 | strptime(intermediate.c_str(), "%Y-%m-%dT%H:%M:%S", &tm); 36 | // std::cout<<"Debug "<< intermediate << "" << std::mktime(&tm) <<"\n"; 37 | return std::mktime(&tm); 38 | } 39 | return 0; 40 | }; 41 | 42 | double string2timestamp(const std::string &intermediate, 43 | const std::string &format){ 44 | if (format.empty()){ 45 | // double value as timestamp 46 | return std::stod(intermediate); 47 | } else { 48 | // 2020-01-01T00:00:27 49 | std::tm tm = {}; 50 | strptime(intermediate.c_str(),format.c_str(), &tm); 51 | // std::cout<<"Debug "<< intermediate << "" << std::mktime(&tm) <<"\n"; 52 | return std::mktime(&tm); 53 | } 54 | return 0; 55 | }; 56 | 57 | bool point_comp(Point &p1,Point &p2) { 58 | return (p1.timestamp geom; 64 | }; 65 | 66 | typedef std::unordered_map TrajIDMap; 67 | 68 | typedef std::vector DataStore; 69 | 70 | // struct OutputConfig { 71 | // bool write_timestamp=false; 72 | // bool write_index=true; 73 | // }; 74 | 75 | struct InputConfig { 76 | std::string id_name; 77 | std::string x_name; 78 | std::string y_name; 79 | std::string timestamp_name; 80 | int id_idx; 81 | int x_idx; 82 | int y_idx; 83 | int timestamp_idx; 84 | char delim; 85 | bool header; 86 | std::string 
time_format; 87 | }; 88 | 89 | struct OutputConfig { 90 | bool write_ts=false; 91 | bool write_tend=false; 92 | bool write_timestamp=false; 93 | }; 94 | 95 | 96 | 97 | void parse_ofields(OutputConfig &config, const std::string &str){ 98 | char delim = ','; 99 | std::unordered_set fields; 100 | std::stringstream ss(str); 101 | std::string intermediate; 102 | while (getline(ss, intermediate, delim)) { 103 | fields.insert(intermediate); 104 | } 105 | if (fields.find("ts") != fields.end()) { 106 | config.write_ts = true; 107 | } 108 | if (fields.find("tend") != fields.end()) { 109 | config.write_tend = true; 110 | } 111 | if (fields.find("timestamp") != fields.end()) { 112 | config.write_timestamp = true; 113 | } 114 | }; 115 | 116 | // Functions to manipulate trajectories 117 | 118 | // void append_gps_trajectory(Trajectory &traj, Point &p){ 119 | // traj.geom.push_back(p); 120 | // }; 121 | 122 | void append_point(DataStore &ds, TrajIDMap &id_map, std::string &traj_id, 123 | Point &p){ 124 | // Get the current index 125 | auto search = id_map.find(traj_id); 126 | // Search for id 127 | if (search != id_map.end()) { 128 | // A trajectory exists already 129 | int idx = search->second; 130 | ds[idx].geom.push_back(p); 131 | // ds[idx].geom.push_back(p); 132 | } else { 133 | // A new node is found, how to ensure that the ds is sorted 134 | int idx = id_map.size(); 135 | id_map.insert({traj_id, idx}); 136 | ds.push_back(Trajectory()); 137 | ds[idx].id = traj_id; 138 | ds[idx].geom.push_back(p); 139 | }; 140 | }; 141 | 142 | void read_header_config(InputConfig &config){ 143 | config.id_idx = std::atoi(config.id_name.c_str()); 144 | config.x_idx = std::atoi(config.x_name.c_str()); 145 | config.y_idx = std::atoi(config.y_name.c_str()); 146 | config.timestamp_idx = std::atoi(config.timestamp_name.c_str()); 147 | std::cout<<" Id index "<< config.id_idx<<"\n"; 148 | std::cout<<" X index "<< config.x_idx<<"\n"; 149 | std::cout<<" Y index "<< config.y_idx<<"\n"; 150 | 
std::cout<<" Timestamp index "<< config.timestamp_idx<<"\n"; 151 | }; 152 | 153 | void read_header_config(const std::string &row, InputConfig &config){ 154 | int i = 0; 155 | std::string intermediate; 156 | int id_idx = -1; 157 | int x_idx = -1; 158 | int y_idx = -1; 159 | int timestamp_idx = -1; 160 | std::stringstream check1(row); 161 | while (std::getline(check1, intermediate, config.delim)) { 162 | if (intermediate == config.id_name) { 163 | id_idx = i; 164 | } 165 | if (intermediate == config.x_name) { 166 | x_idx = i; 167 | } 168 | if (intermediate == config.y_name) { 169 | y_idx = i; 170 | } 171 | if (intermediate == config.timestamp_name) { 172 | timestamp_idx = i; 173 | } 174 | ++i; 175 | } 176 | if (id_idx < 0 || x_idx < 0 || y_idx < 0 || timestamp_idx<0) { 177 | if (id_idx < 0) { 178 | std::cout<<" Id column "<< config.id_name << " not found\n"; 179 | } 180 | if (x_idx < 0) { 181 | std::cout<<" X column "<< config.x_name << " not found\n"; 182 | } 183 | if (y_idx < 0) { 184 | std::cout<<" Y column "<< config.y_name << " not found\n"; 185 | } 186 | if (timestamp_idx < 0) { 187 | std::cout<<" Timestamp column "<< config.timestamp_name 188 | << " not found\n"; 189 | } 190 | std::exit(EXIT_FAILURE); 191 | } 192 | config.id_idx = id_idx; 193 | config.x_idx = x_idx; 194 | config.y_idx = y_idx; 195 | config.timestamp_idx = timestamp_idx; 196 | std::cout<<" Id index "<< id_idx<<"\n"; 197 | std::cout<<" X index "<< x_idx<<"\n"; 198 | std::cout<<" Y index "<< y_idx<<"\n"; 199 | std::cout<<" Timestamp index "<< timestamp_idx<<"\n"; 200 | }; 201 | 202 | void read_row_to_point(long long row_index, std::string &row, 203 | InputConfig &config, std::string &traj_id, Point &p){ 204 | // Parse fields from the input line 205 | std::stringstream ss(row); 206 | std::string intermediate; 207 | double x = 0, y = 0; 208 | double timestamp = 0; 209 | int index = 0; 210 | bool id_parsed = false; 211 | bool x_parsed = false; 212 | bool y_parsed = false; 213 | bool timestamp_parsed = 
false; 214 | // std::cout<<"Row "<< row << "\n"; 215 | while (std::getline(ss, intermediate, config.delim)) { 216 | if (index == config.id_idx) { 217 | traj_id = intermediate; 218 | id_parsed = true; 219 | } 220 | if (index == config.x_idx) { 221 | p.x = std::stod(intermediate); 222 | x_parsed = true; 223 | } 224 | if (index == config.y_idx) { 225 | p.y = std::stod(intermediate); 226 | y_parsed = true; 227 | } 228 | if (index == config.timestamp_idx) { 229 | // std::cout<<"Timestamp "<< intermediate << "\n"; 230 | p.timestamp = string2timestamp(intermediate,config.time_format); 231 | timestamp_parsed = true; 232 | } 233 | ++index; 234 | } 235 | if (!(id_parsed && x_parsed && y_parsed && timestamp_parsed)) { 236 | std::cout<<" Error in parsing row " << row_index << " "<< row << "\n"; 237 | std::exit(EXIT_FAILURE); 238 | } 239 | }; 240 | 241 | void read_traj_data(std::ifstream &ifs, InputConfig &config, 242 | DataStore &ds, TrajIDMap &id_map){ 243 | std::cout<<" Read gps data\n"; 244 | long long num_traj; 245 | long long num_point; 246 | std::string row; 247 | // skip header 248 | if (config.header){ 249 | std::getline(ifs, row); 250 | read_header_config(row, config); 251 | } else { 252 | read_header_config(config); 253 | } 254 | long long progress = 0; 255 | // Ensure that the data is sorted in ascending order by time 256 | while (std::getline(ifs, row)) { 257 | if (progress%1000000==0) { 258 | std::cout<<" Lines read " << progress << "\n"; 259 | } 260 | Point point; 261 | std::string traj_id; 262 | read_row_to_point(progress, row, config, traj_id, point); 263 | // std::cout << "Point timestamp "<< point.timestamp << "\n"; 264 | append_point(ds, id_map, traj_id, point); 265 | ++progress; 266 | } 267 | std::cout<<" Read gps data done with lines count "<time_gap || distance>dist_gap) { 342 | if (end_idx-1>start_idx){ 343 | num_traj+=1; 344 | num_point+=end_idx-start_idx; 345 | write_part_trip(ofs, config, num_traj, traj, start_idx, end_idx-1); 346 | } 347 | start_idx 
= end_idx; 348 | } 349 | } 350 | if (end_idx>start_idx){ 351 | num_traj+=1; 352 | num_point+=end_idx-start_idx+1; 353 | write_part_trip(ofs, config, num_traj, traj, start_idx, end_idx); 354 | } 355 | ++progress; 356 | } 357 | }; 358 | 359 | bool check_file_exist(const std::string &filename){ 360 | const char *filename_c_str = filename.c_str(); 361 | struct stat buf; 362 | if (stat(filename_c_str, &buf) != -1) { 363 | return true; 364 | } 365 | return false; 366 | }; 367 | 368 | void print_help(){ 369 | std::cout<<"Usage:\n"; 370 | std::cout<<"-i/--input: input gps file\n"; 371 | std::cout<<"-o/--output: output trajectory file\n"; 372 | std::cout<<"-d/--delim: delimiter character (, by default)\n"; 373 | std::cout<<"--id: id column name or index (id by default)\n"; 374 | std::cout<<"-x/--x: x column name or index (x by default)\n"; 375 | std::cout<<"-y/--y: y column name or index (y by default)\n"; 376 | std::cout<<"-t/--time: time column name or index (timestamp by default)\n"; 377 | std::cout<<"-f/--tf: time format(0 for int, 1 for 2020-01-01T00:00:27)\n"; 378 | std::cout<<"--time_gap: time gap to split long trajectory \n"; 379 | std::cout<<"--dist_gap: dist gap to split long trajectory \n"; 380 | std::cout<<"--no_header: if specified, gps file contains no header\n"; 381 | std::cout<<"--ofields: output fields (ts,tend,timestamp) separated by , default no output fields\n"; 382 | std::cout<<"-h/--help: print help information\n"; 383 | }; 384 | 385 | int main(int argc, char**argv){ 386 | std::cout<<"gps2traj\n"; 387 | if (argc==1){ 388 | print_help(); 389 | return 0; 390 | } 391 | std::string input_file; 392 | std::string output_file; 393 | std::string id_name = "id"; 394 | std::string x_name = "x"; 395 | std::string y_name = "y"; 396 | std::string timestamp_name = "timestamp"; 397 | std::string output_fields = ""; 398 | bool header = true; 399 | char delim = ','; 400 | int opt; 401 | double dist_gap=1e9; 402 | double time_gap=1e9; 403 | // int time_format = 0; 404 | 
std::string time_format=""; 405 | // The last element of the array has to be filled with zeros. 406 | static struct option long_options[] = 407 | { 408 | {"input", required_argument,0,'i' }, 409 | {"output", required_argument,0,'o' }, 410 | {"delim", required_argument,0,'d' }, 411 | {"id", required_argument,0, 0}, 412 | {"x", required_argument,0,'x' }, 413 | {"y", required_argument,0,'y' }, 414 | {"time", required_argument,0,'t' }, 415 | {"tf", required_argument,0, 'f'}, 416 | {"time_gap", required_argument,0, 0}, 417 | {"ofields", required_argument,0, 0}, 418 | {"dist_gap", required_argument,0, 0}, 419 | {"no_header", no_argument, 0, 0}, 420 | {"help", no_argument,0,'h' }, 421 | {0, 0, 0, 0 } 422 | }; 423 | int long_index =0; 424 | while ((opt = getopt_long(argc, argv,"i:o:d:0:t:x:y:f:h", 425 | long_options, &long_index )) != -1) 426 | { 427 | switch (opt) 428 | { 429 | case 'i': 430 | input_file = std::string(optarg); 431 | break; 432 | case 'o': 433 | output_file = std::string(optarg); 434 | break; 435 | case 'd': 436 | delim = *optarg; 437 | break; 438 | case 'x': 439 | x_name = std::string(optarg); 440 | break; 441 | case 'y': 442 | y_name = std::string(optarg); 443 | break; 444 | case 't': 445 | timestamp_name = std::string(optarg); 446 | break; 447 | case 'f': 448 | time_format = std::string(optarg); 449 | break; 450 | case 'h': 451 | print_help(); 452 | std::exit(EXIT_SUCCESS); 453 | case 0: 454 | if (strcmp(long_options[long_index].name,"id")==0){ 455 | id_name = std::string(optarg); 456 | } 457 | // printf("option %s", long_options[long_index].name); 458 | if (strcmp(long_options[long_index].name,"time_gap")==0){ 459 | time_gap = std::atof(optarg); 460 | } 461 | if (strcmp(long_options[long_index].name,"dist_gap")==0){ 462 | dist_gap = std::atof(optarg); 463 | } 464 | if (strcmp(long_options[long_index].name,"no_header")==0){ 465 | header = false; 466 | } 467 | if (strcmp(long_options[long_index].name,"ofields")==0){ 468 | output_fields = 
std::string(optarg); 469 | } 470 | break; 471 | default: 472 | print_help(); 473 | exit(EXIT_FAILURE); 474 | } 475 | } 476 | if (!check_file_exist(input_file)) 477 | { 478 | std::cout<<" Error: Input file not found: "<< input_file <<"\n"; 479 | std::exit(EXIT_FAILURE); 480 | } 481 | std::cout<<"---- Configurations ----\n"; 482 | std::cout<<" Input file: "<( 509 | t2 - t1 ).count(); 510 | std::cout<<"Reading input takes " << input_duration << " ms\n"; 511 | std::cout<<"---- Sorting points in trajectory ----\n"; 512 | sort_data_store(ds); 513 | auto t3 = std::chrono::high_resolution_clock::now(); 514 | auto sort_duration = std::chrono::duration_cast( 515 | t3 - t2 ).count(); 516 | std::cout<<"Sorting points takes " << sort_duration << " ms\n"; 517 | std::cout<<"---- Writing trajectory data ----\n"; 518 | std::ofstream ofs(output_file); 519 | ofs.precision(12); 520 | write_traj_data(ofs, output_config, ds, id_map, time_gap, dist_gap, 521 | num_traj, num_point); 522 | auto t4 = std::chrono::high_resolution_clock::now(); 523 | auto write_duration = std::chrono::duration_cast( 524 | t4 - t3 ).count(); 525 | std::cout<<"Write output takes " << write_duration << " ms\n"; 526 | std::cout<<"---- gps2traj statistcs ----\n"; 527 | std::cout<<" Distinct ids "<< id_map.size() <<"\n"; 528 | std::cout<<" Number of trips "<< num_traj <<"\n"; 529 | std::cout<<" Number of points "<< num_point <<"\n"; 530 | auto t5 = std::chrono::high_resolution_clock::now(); 531 | auto whole_duration = std::chrono::duration_cast( 532 | t5 - t1 ).count(); 533 | std::cout<<"gps2traj finish in " << whole_duration <<" ms \n"; 534 | }; 535 | -------------------------------------------------------------------------------- /traj2gps.cpp: -------------------------------------------------------------------------------- 1 | // Author: Can Yang 2 | // Email : cyang@kth.se 3 | 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 
| #include 15 | #include 16 | #include 17 | #include 18 | 19 | // Data types 20 | 21 | struct Point { 22 | double x; 23 | double y; 24 | }; 25 | 26 | struct Trajectory { 27 | std::string id; 28 | std::vector points; 29 | }; 30 | 31 | struct InputConfig { 32 | std::string id_name="id"; 33 | int id_idx; 34 | std::string geom_name="geom"; 35 | int geom_idx; 36 | char delim = ';'; 37 | bool header = true; 38 | std::string input_file; 39 | std::string output_file; 40 | }; 41 | 42 | void read_header_config(InputConfig &config){ 43 | config.id_idx = std::atoi(config.id_name.c_str()); 44 | config.geom_idx = std::atoi(config.geom_name.c_str()); 45 | std::cout<<" Id index "<< config.id_idx<<"\n"; 46 | std::cout<<" Geom index "<< config.geom_idx<<"\n"; 47 | }; 48 | 49 | void read_header_config(const std::string &row, InputConfig &config){ 50 | int i = 0; 51 | std::string intermediate; 52 | int id_idx = -1; 53 | int geom_idx = -1; 54 | std::stringstream check1(row); 55 | while (std::getline(check1, intermediate, config.delim)) { 56 | if (intermediate == config.id_name) { 57 | id_idx = i; 58 | } 59 | if (intermediate == config.geom_name) { 60 | geom_idx = i; 61 | } 62 | ++i; 63 | } 64 | if (id_idx < 0 || geom_idx < 0) { 65 | if (id_idx < 0) { 66 | std::cout<<" Id column "<< config.id_name << "not found\n"; 67 | } 68 | if (geom_idx < 0) { 69 | std::cout<<" geom column "<< config.geom_name << "not found\n"; 70 | } 71 | std::exit(EXIT_FAILURE); 72 | } 73 | config.id_idx = id_idx; 74 | config.geom_idx = geom_idx; 75 | std::cout<<" Id index "<< id_idx<<"\n"; 76 | std::cout<<" Geom index "<< geom_idx<<"\n"; 77 | }; 78 | 79 | bool iequals(const std::string& a, const std::string& b) 80 | { 81 | unsigned int sz = a.size(); 82 | if (b.size() != sz) 83 | return false; 84 | for (unsigned int i = 0; i < sz; ++i) 85 | if (tolower(a[i]) != tolower(b[i])) 86 | return false; 87 | return true; 88 | }; 89 | 90 | std::vector wkt2traj(const std::string &str){ 91 | std::vector pts; 92 | 
std::stringstream stringStream(str); 93 | std::string line; 94 | std::vector tokens; 95 | while(std::getline(stringStream, line)) 96 | { 97 | std::size_t prev = 0, pos; 98 | while ((pos = line.find_first_of("() ,", prev)) != std::string::npos) 99 | { 100 | if (pos > prev) 101 | tokens.push_back(line.substr(prev, pos-prev)); 102 | prev = pos+1; 103 | } 104 | if (prev < line.length()) 105 | tokens.push_back(line.substr(prev, std::string::npos)); 106 | } 107 | // Iterate to parse data 108 | auto iter = tokens.begin(); 109 | if (iter != tokens.end() && 110 | iequals(*iter, "LINESTRING")) 111 | { 112 | ++iter; 113 | } 114 | else 115 | { 116 | std::cout<<"Error, geom field should start with LINESTRING "<( 301 | t2 - t1 ).count(); 302 | std::cout<<"traj2gps finish in " << whole_duration <<" ms \n"; 303 | }; 304 | --------------------------------------------------------------------------------