NATSORTFILES Examples
The function NATSORTFILES sorts a cell array of filenames or filepaths, taking into account any number values within the strings. This is known as a "natural order sort" or an "alphanumeric sort". Note that MATLAB's inbuilt SORT function sorts the character codes only (as does sort in most programming languages).
NATSORTFILES is not a naive natural-order sort, but sorts the filenames and file extensions separately: this prevents the file extension separator character . and file extension itself from influencing the sort order of the complete filename+extension. Thus NATSORTFILES sorts shorter filenames before longer ones, which is known as a "dictionary sort". For the same reason filepaths are split at each path-separator character, and each directory level is sorted separately. See the "Explanation" sections below for more details.
For sorting the rows of a cell array of strings use NATSORTROWS.
For sorting a cell array of strings use NATSORT.
Basic Usage:
By default NATSORTFILES interprets consecutive digits as a single integer, and each number is treated as being as wide as one letter:
A = {'a2.txt', 'a10.txt', 'a1.txt'};
sort(A)
natsortfiles(A)
ans =
    'a1.txt'    'a10.txt'    'a2.txt'
ans =
    'a1.txt'    'a2.txt'    'a10.txt'
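To illustrate why the two results differ, the sketch below reproduces the natural order for this one example using only built-in functions. It is not how NATSORTFILES is implemented: it assumes that every name contains exactly one run of digits and it ignores the surrounding text. The variable names tok, num, idx and B are chosen here purely for illustration.

```matlab
A = {'a2.txt', 'a10.txt', 'a1.txt'};
% Extract the first run of digits from each name (assumption: one run per name).
tok = regexp(A, '\d+', 'match', 'once');
% Convert the digit strings to numeric values.
num = str2double(tok);
% Sort by numeric value rather than by character code.
[~, idx] = sort(num);
B = A(idx);
```

SORT places 'a10.txt' before 'a2.txt' because it compares the characters '1' and '2', whereas the sketch compares the values 10 and 2.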
Output 2: Sort Index
The second output argument is a numeric array of the sort indices ndx, such that Y = X(ndx) where Y = natsortfiles(X):
[~,ndx] = natsortfiles(A)
ndx =
     3     1     2
Example with DIR and a Cell Array
One common situation is using DIR to identify files in a folder, sort them into the correct order, and then loop over them: below is an example of how to do this. Remember to preallocate all output arrays before the loop!
D = 'natsortfiles_test'; % directory path
S = dir(fullfile(D,'*.txt')); % get list of files in directory
N = natsortfiles({S.name}); % sort file names into order
for k = 1:numel(N)
    fullfile(D,N{k})
end
ans =
natsortfiles_test\A_1.txt
ans =
natsortfiles_test\A_1-new.txt
ans =
natsortfiles_test\A_1_new.txt
ans =
natsortfiles_test\A_2.txt
ans =
natsortfiles_test\A_3.txt
ans =
natsortfiles_test\A_10.txt
ans =
natsortfiles_test\A_100.txt
ans =
natsortfiles_test\A_200.txt
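The preallocation advice can be sketched as follows. To keep the sketch runnable on its own, the sorted names are hard-coded (a subset of the names listed above) rather than obtained from DIR and NATSORTFILES. The output arrays P and L, and the choice of storing the full path and the name length, are hypothetical stand-ins for whatever is actually computed per file.

```matlab
% A subset of the sorted names shown above, hard-coded for illustration.
N = {'A_1.txt', 'A_2.txt', 'A_10.txt'};
P = cell(size(N));   % preallocate a cell array for the full paths
L = zeros(size(N));  % preallocate a numeric array for the name lengths
for k = 1:numel(N)
    P{k} = fullfile('natsortfiles_test', N{k}); % store the full path
    L(k) = numel(N{k});                         % store the name length
end
```

Sizing the outputs from N before the loop avoids growing the arrays on every iteration.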
Example with DIR and a Structure
Users who need to access the DIR structure fields can use NATSORTFILES's second output to sort DIR's output structure into the correct order:
D = 'natsortfiles_test'; % directory path
S = dir(fullfile(D,'*.txt')); % get list of files in directory
[~,ndx] = natsortfiles({S.name}); % indices of correct order
S = S(ndx); % sort structure using indices
for k = 1:numel(S) % loop over the sorted structure elements
    S(k).name;
    S(k).date;
end
Explanation: Dictionary Sort
Filenames and file extensions are separated by the extension separator, the period character . (character code 46), which gets sorted after all of the characters with codes from 0 to 45, including !"#$%&'()*+,-, the space character, and all of the control characters (newlines, tabs, etc). This means that a naive sort or natural-order sort will sort some short filenames after longer filenames. In order to provide the correct dictionary sort, with shorter filenames first, NATSORTFILES sorts the filenames and file extensions separately:
B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'; 'test.bbb.m'};
sort(B) % '-' sorts before '.'
natsort(B) % '-' sorts before '.'
natsortfiles(B) % correct dictionary sort
ans =
    'test-aaa.m'
    'test.bbb.m'
    'test.m'
    'test_ccc.m'
ans =
    'test-aaa.m'
    'test.bbb.m'
    'test.m'
    'test_ccc.m'
ans =
    'test.m'
    'test-aaa.m'
    'test.bbb.m'
    'test_ccc.m'
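The effect of separating name and extension can be sketched with built-in functions only. This is an illustration of the idea, not the implementation used by NATSORTFILES: it sorts on the name part alone, does not use the extension to break ties, and ignores any numbers. The variable names codes, names, exts, idx and Z are chosen here for illustration.

```matlab
% Character codes of hyphen, period and underscore: 45, 46 and 95.
codes = double('-._');

B = {'test_ccc.m'; 'test-aaa.m'; 'test.m'; 'test.bbb.m'};
% Split each filename at its last period into name and extension.
[~, names, exts] = cellfun(@fileparts, B, 'UniformOutput', false);
% Sort on the names only, so the period no longer takes part in the comparison.
[~, idx] = sort(names);
Z = B(idx);
```

Once the extension is removed, 'test' is a prefix of every other name and therefore sorts first, which matches the NATSORTFILES output shown above for this input.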
Explanation: Filenames
NATSORTFILES combines a dictionary sort with a natural-order sort, so that the number values within the filenames are taken into consideration:
C = {'test2.m'; 'test10-old.m'; 'test.m'; 'test10.m'; 'test1.m'};
sort(C) % Wrong numeric order.
natsort(C) % Correct numeric order, but longer before shorter.
natsortfiles(C) % Correct numeric order and dictionary sort.
ans =
    'test.m'
    'test1.m'
    'test10-old.m'
    'test10.m'
    'test2.m'
ans =
    'test.m'
    'test1.m'
    'test2.m'
    'test10-old.m'
    'test10.m'
ans =
    'test.m'
    'test1.m'
    'test2.m'
    'test10.m'
    'test10-old.m'
Explanation: Filepaths
For the same reason, filepaths are split at each file path separator character (both / and \ are considered to be file path separators) and each level of the directory hierarchy is sorted separately. This ensures that the directory names are sorted with a dictionary sort and that any numbers are taken into consideration:
D = {'A2-old\test.m';'A10\test.m';'A2\test.m';'AXarchive.zip';'A1\test.m'};
sort(D) % Wrong numeric order, and '-' sorts before '\'.
natsort(D) % correct numeric order, but longer before shorter.
natsortfiles(D) % correct numeric order and dictionary sort.
ans =
    'A10\test.m'
    'A1\test.m'
    'A2-old\test.m'
    'A2\test.m'
    'AXarchive.zip'
ans =
    'A1\test.m'
    'A2-old\test.m'
    'A2\test.m'
    'A10\test.m'
    'AXarchive.zip'
ans =
    'AXarchive.zip'
    'A1\test.m'
    'A2\test.m'
    'A2-old\test.m'
    'A10\test.m'
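Splitting a filepath into its levels can be sketched with the built-in REGEXP function. The pattern below is an assumption made for illustration (it treats both / and \ as separators, as described above) and is not necessarily the pattern used inside NATSORTFILES. The variable names parts and depth are hypothetical.

```matlab
D = {'A2-old\test.m'; 'A10\test.m'; 'A2\test.m'; 'AXarchive.zip'; 'A1\test.m'};
% Split every path at each forward slash or backslash.
parts = regexp(D, '[\\/]', 'split');
% Count the number of levels in each path.
depth = cellfun(@numel, parts);
```

Here parts{1} holds the two levels 'A2-old' and 'test.m', while parts{4} holds the single level 'AXarchive.zip'. Comparing 'A2' with 'A2-old' as separate directory names, without the trailing separator, is what allows the shorter name to sort first.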
Regular Expression: Decimal Numbers, E-notation, +/- Sign.
NATSORTFILES is a wrapper for NATSORT, which means all of NATSORT's options are also supported. In particular the number recognition can be customized to detect numbers with decimal digits, E-notation, a +/- sign, or other specific features. This detection is defined by providing an appropriate regular expression: see NATSORT for details and examples.
E = {'test24.csv','test1.8.csv','test5.csv','test3.3.csv','test12.csv'};
natsortfiles(E,'\d+(\.\d+)?')
ans =
    'test1.8.csv'    'test3.3.csv'    'test5.csv'    'test12.csv'    'test24.csv'
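The difference that the regular expression makes can be checked directly with the built-in REGEXP function. In the sketch below the integer-only pattern '\d+' is an assumption used for comparison (see NATSORT for the actual default), and the variable names str, intTok, decTok and val are chosen for illustration.

```matlab
str = 'test1.8.csv';
% Integer-only pattern: the digits on either side of the period are separate numbers.
intTok = regexp(str, '\d+', 'match');
% Pattern with an optional fractional part: one decimal number is detected.
decTok = regexp(str, '\d+(\.\d+)?', 'match');
% Numeric value of the detected decimal number.
val = str2double(decTok);
```

With the integer-only pattern the name contains the two numbers 1 and 8, whereas with the decimal pattern it contains the single number 1.8, which is why 'test1.8.csv' sorts before 'test3.3.csv' in the example above.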