members() overflow management is KO after stacksize() removal. By the way, members() becomes very slow for big input arrays

Reported by Samuel GOUGEON (@sgougeon)

BUG DESCRIPTION:
----------------
When stacksize() was removed in https://codereview.scilab.org/#/c/16791,
the overflow management in members() was dropped entirely instead of being updated.
It is now broken: members() can fail with an allocation error, or even exhaust the computer's memory and crash it:

With a big array of text:

--> exec('functions_stats.sce', -1)
at line   112 of function repmat     ( SCI\modules\elementary_functions\macros\repmat.sci line 124 )
at line   296 of function members    ( SCI\modules\elementary_functions\macros\members.sci line 309 )
at line    14 of function %c_dsearch ( SCI\modules\overloading\macros\%c_dsearch.sci line 26 )
in builtin                dsearch    
at line   402 of function histc      ( SCI\modules\statistics\macros\histc.sci line 413 )
at line    78 of executed file functions_stats.sce

Can not allocate 280.7 GB memory.

Some examples crash the whole computer, not only the Scilab session.

In addition, members(N,H) becomes very slow for big N and/or H arrays.
Profiling the code shows that the bottleneck is the calls to repmat(), which dominate the total execution time by far.
Replacing the current algorithm with a simple for loop over the elements of the Needle N speeds up members() by a factor of ~100.
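
For illustration only, here is a minimal sketch of what such a loop could look like. members_loop() is a hypothetical name, not the actual members() code nor a proposed patch. It assumes N and H are 2D arrays of the same type, it returns only the count and the linear index of the first occurrence, and it ignores the other members() options:

```scilab
// Hypothetical helper (not part of Scilab): loops over the needles instead of
// expanding N and H with repmat().
function [nb, loc] = members_loop(N, H)
    nb  = zeros(size(N, 1), size(N, 2));  // number of occurrences of N(i) in H
    loc = zeros(size(N, 1), size(N, 2));  // linear index in H of the first one, 0 if none
    for i = 1:size(N, "*")
        k = find(H == N(i));              // one boolean array the size of H per iteration
        nb(i) = size(k, "*");
        if nb(i) > 0 then
            loc(i) = k(1);
        end
    end
endfunction

[nb, loc] = members_loop(["a" "z" "b"], ["b" "a" "a"]);
disp(nb, loc)
```

In this sketch, each iteration only allocates temporaries the size of H, so the memory needed no longer grows with size(N,"*") * size(H,"*").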


ERROR LOG:
----------


HOW TO REPRODUCE THE BUG:
-------------------------
n = 4e4;
// Build n random strings of 5 characters "a" to "e":
t = strsplit(ascii(grand(n,5,"uin",97,101)),5:5:5*n-1);
u = unique(t);

tic();
[nb, loc] = members(u,t);
disp(toc())