members() overflow management is broken since the removal of stacksize(). In addition, members() is very slow for big input arrays
Reported by Samuel GOUGEON (@sgougeon)
BUG DESCRIPTION:
----------------
When stacksize() was removed (https://codereview.scilab.org/#/c/16791),
the overflow management in members() was dropped entirely instead of being updated.
It is now broken: members() either yields an error, or makes the computer run out of memory and crash.
With a big array of text:
--> exec('functions_stats.sce', -1)
at line 112 of function repmat ( SCI\modules\elementary_functions\macros\repmat.sci line 124 )
at line 296 of function members ( SCI\modules\elementary_functions\macros\members.sci line 309 )
at line 14 of function %c_dsearch ( SCI\modules\overloading\macros\%c_dsearch.sci line 26 )
in builtin dsearch
at line 402 of function histc ( SCI\modules\statistics\macros\histc.sci line 413 )
at line 78 of executed file functions_stats.sce
Can not allocate 280.7 GB memory.
Some examples crash the whole computer, not only the Scilab session.
In addition, members(N,H) becomes very slow for big N and/or H arrays.
Profiling the code shows that the bottleneck is the calls to repmat(), which dominate the total execution time by far.
Replacing the current algorithm with a simple for loop over the elements of the Needle array N speeds up members() by a factor of ~100.
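As an illustration only, the loop-based approach could look like the sketch below. It is not the actual patch: members_loop is a hypothetical name, and the sketch covers only the basic [nb, loc] = members(N, H) call, assuming that nb counts the occurrences of each needle in H and that loc is the linear index of its first occurrence (0 if absent). Options of members() are not handled.

```scilab
// Hypothetical helper, NOT the real members() code: it loops over the needles
// instead of expanding N and H with repmat(), so the extra memory used at any
// time is only one boolean array of the size of H.
function [nb, loc] = members_loop(N, H)
    nb  = zeros(size(N, 1), size(N, 2));   // occurrences of each needle in H
    loc = zeros(size(N, 1), size(N, 2));   // index of the first occurrence, 0 if none
    for i = 1:size(N, "*")
        k = find(H == N(i));               // linear indices of the matches in H
        nb(i) = size(k, "*");
        if nb(i) > 0 then
            loc(i) = k(1);
        end
    end
endfunction
```

For example, [nb, loc] = members_loop(["a" "z" "c"], ["a" "b" "a" "c"]) should give nb = [2 0 1] and loc = [1 0 4].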
ERROR LOG:
----------
HOW TO REPRODUCE THE BUG:
-------------------------
n = 4e4;
// Build n random strings of 5 characters "a" to "e":
t = strsplit(ascii(grand(n,5,"uin",97,101)),5:5:5*n-1);
u = unique(t);
tic();
[nb, loc] = members(u,t);
disp(toc())