Authored by Vipul

Optimization - C++

Compilation :

  • Command :

    g++ -O2 -std=c++14 -Wall -Wextra -pedantic -Wformat=2 -Wfloat-equal -Wlogical-op -Wredundant-decls -Wconversion -Wcast-qual -Wcast-align -Wuseless-cast -Wno-shadow -Wno-unused-result -Wno-unused-parameter -Wno-unused-local-typedefs -Wno-long-long -DLOCAL_PROJECT -g -DLOCAL_DEBUG -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC -fsanitize=address,undefined   
    g++ -O2 -std=c++14 -Wall -Wextra -pedantic -Wshadow -Wformat=2 -Wfloat-equal -Wconversion -Wlogical-op -Wshift-overflow=2 -Wduplicated-cond -Wcast-qual -Wcast-align -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC -D_FORTIFY_SOURCE=2 -fsanitize=address,undefined -fno-sanitize-recover -fstack-protector 

1. codeforces.com/blog/entry/49449
2. codeforces.com/blog/entry/15547
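
The first command passes -DLOCAL_DEBUG on the command line. One common way to use such a define is to print debug output only in local builds. The following is a minimal sketch of that idea; the macro name DBG and the function sum_all are hypothetical and do not come from the linked posts.

```cpp
#include <iostream>
#include <vector>

// DBG is a hypothetical helper name, not part of any library.
// It prints "name = value" to stderr only when compiled with -DLOCAL_DEBUG.
#ifdef LOCAL_DEBUG
#define DBG(x) (std::cerr << #x << " = " << (x) << '\n')
#else
#define DBG(x) ((void)0)
#endif

// Sums a vector, logging the running total in local builds only.
long long sum_all(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v) {
        total += x;
        DBG(total);
    }
    return total;
}
```

Without -DLOCAL_DEBUG the macro expands to nothing, so the submitted binary produces no extra output.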

Debugging

  • Optimization [5]: use the -O2 optimization level (the default is -O0)
  • -fsanitize=address,undefined [6][11] (-fsanitize=memory is Clang-only and cannot be combined with address)
  • Warnings[7]
  • GCC debugging option[8]
  • Clang Manual[12]

Pragmas in gcc compiler[4][13] :

  • #pragma GCC optimize ("Ofast") :- Lets GCC auto-vectorize for loops and optimize floating-point code more aggressively (assumes associativity and turns off denormals).

  • #pragma GCC target ("avx,avx2") :- Can double the performance of vectorized code, but causes crashes on old machines that lack these instructions.

  • #pragma GCC optimize ("trapv") :- Kills the program on signed integer overflow. It is really slow, so use it only for local testing.

  • #pragma GCC optimize "O3"

  • #pragma GCC optimize "unroll-loops,omit-frame-pointer"
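
As a rough illustration of where such a pragma can go, the sketch below places it after the #include lines so that it applies only to the functions defined after it. The function name dot is hypothetical, and any speedup depends on the compiler version and the machine.

```cpp
#include <cstddef>
#include <vector>

// Placed after the #include lines so the options apply only to the
// functions defined below, not to the standard headers.
// Compilers other than GCC may ignore this pragma (possibly with a warning).
#pragma GCC optimize("O3,unroll-loops")

// A simple loop that the optimizer may unroll and vectorize.
long long dot(const std::vector<int>& a, const std::vector<int>& b) {
    long long acc = 0;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        acc += static_cast<long long>(a[i]) * b[i];
    }
    return acc;
}
```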

#pragma GCC target("sse,sse2,sse3,sse4,ssse3,popcnt,abm,mmx,avx,tune=native")

  • avx : Advanced Vector Extensions
  • mmx : SIMD instruction set (often expanded as Matrix Math Extensions or MultiMedia Extensions)
  • sse/sse2/sse3 : Streaming SIMD Extensions
  • ssse3 : Supplemental Streaming SIMD Extensions 3 (upgrade from SSE3)
  • sse4 : major upgrade to SSSE3
  • popcnt : Population count (counts the number of bits set to 1)
  • lzcnt : Leading zero count
  • abm : Advanced Bit Manipulation (popcnt + lzcnt) by AMD
  • tune=native : tune for native architecture
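
The popcnt and lzcnt operations listed above are usually reached through GCC/Clang builtins rather than written by hand. A minimal sketch follows, assuming a GCC-compatible compiler; the wrapper names count_ones and leading_zeros are hypothetical. Whether the builtins compile to single instructions depends on the enabled target options.

```cpp
#include <cstdint>

// Number of bits set to 1 (what the popcnt instruction computes).
int count_ones(std::uint64_t x) {
    return __builtin_popcountll(x);
}

// Number of leading zero bits in a 64-bit value (what lzcnt computes).
// __builtin_clzll is undefined for 0, so that case is handled explicitly.
int leading_zeros(std::uint64_t x) {
    return x == 0 ? 64 : __builtin_clzll(x);
}
```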

Note: In general, GCC does not recommend the use of pragmas; see Function Attributes for further explanation.
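
For comparison, a function attribute scopes the options to a single function instead of the rest of the file. This is only a sketch: the macro name HOT and the function sum_squares are hypothetical, and the attribute is applied only when the compiler is GCC.

```cpp
#include <cstddef>
#include <vector>

// HOT is a hypothetical macro name. On GCC it expands to the `optimize`
// function attribute; elsewhere it expands to nothing.
#if defined(__GNUC__) && !defined(__clang__)
#define HOT __attribute__((optimize("O3")))
#else
#define HOT
#endif

// Only this function is compiled with the extra options.
HOT long long sum_squares(const std::vector<int>& v) {
    long long acc = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        acc += static_cast<long long>(v[i]) * v[i];
    }
    return acc;
}
```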

The line below works with MSVC++ and Clang++ (ref: codeforces) [1, 2]

  • #pragma comment(linker, "/stack:227420978") :- Sets the stack size. Without it, your solution may crash with a stack overflow (in deeply recursive functions, for example).

Note: Changing the stack size with GNU GCC [1]

  • Linux command
    ulimit -a # show all limits, including the stack size
    ulimit -s 32768 # set the stack size to 32768 KiB (32 MiB)
  • GCC command
    • -Wl,--stack=268435456 [3] (not working with GCC).
    • Compile with the option -fno-stack-limit
    • Or set the limit from the source code [9, 10]:
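#include <cstdio>
#include <sys/resource.h>
int main() {
  const rlim_t val = 268435456;  // 256 MiB
  // Plain assignments: designated initializers are not standard before C++20.
  struct rlimit lim;
  lim.rlim_cur = val;
  lim.rlim_max = val;
  if (setrlimit(RLIMIT_STACK, &lim) != 0) {
    std::perror("setrlimit");  // e.g. the value exceeds the hard limit
  }
}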
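
A slightly more defensive variant, sketched here for POSIX systems only, first queries the current limits with getrlimit and raises only the soft limit, capped by the hard limit. The helper name raise_stack_limit is hypothetical.

```cpp
#include <sys/resource.h>

// raise_stack_limit is a hypothetical helper name (POSIX only).
// Raises the soft stack limit to at least `want` bytes, capped by the
// hard limit, and returns the soft limit in effect afterwards (0 on error).
rlim_t raise_stack_limit(rlim_t want) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return 0;  // could not query the limit
    }
    if (lim.rlim_max != RLIM_INFINITY && want > lim.rlim_max) {
        want = lim.rlim_max;  // unprivileged processes cannot exceed the hard limit
    }
    if (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur >= want) {
        return lim.rlim_cur;  // already large enough, nothing to change
    }
    lim.rlim_cur = want;
    if (setrlimit(RLIMIT_STACK, &lim) != 0) {
        return 0;  // request rejected
    }
    return want;
}
```

Calling it at the start of main(), for example raise_stack_limit(268435456), leaves the hard limit untouched, so the request is less likely to be rejected than one that also sets rlim_max.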

Source Code

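  • ios_base::sync_with_stdio(false) : Turns off synchronization between C++ streams and C stdio, which usually speeds up cin/cout. Do not mix them with scanf/printf afterwards.
  • cin.tie(nullptr) : Unties cin from cout, so cout is no longer flushed before every read from cin.
  • cin.exceptions(cin.failbit) : Throws an exception when an input operation fails (logical error on I/O operation) [1]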
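
The three calls above can be sketched together as follows. The function names setup_fast_io and read_sum are hypothetical; read_sum takes any std::istream so it can be tried on a string as well as on std::cin. Note that with failbit exceptions enabled, reading past the end of input also throws, so the sketch reads a count first instead of looping until EOF.

```cpp
#include <iostream>
#include <sstream>

// Typical setup at the top of main() when reading from std::cin.
void setup_fast_io() {
    std::ios_base::sync_with_stdio(false);  // stop syncing with C stdio
    std::cin.tie(nullptr);                  // do not flush cout before each read
}

// Reads a count n followed by n integers and returns their sum.
// With failbit exceptions enabled, malformed or missing input throws
// std::ios_base::failure instead of silently leaving variables unchanged.
long long read_sum(std::istream& in) {
    in.exceptions(std::ios_base::failbit);
    int n = 0;
    in >> n;
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        int x = 0;
        in >> x;
        total += x;
    }
    return total;
}
```

In a real solution this would be called as setup_fast_io() followed by read_sum(std::cin).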

Predefined Macros[MSDN]

  • __cplusplus
  • __PRETTY_FUNCTION__ (GCC/Clang extension; the MSVC counterpart is __FUNCSIG__)
  • __LINE__
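
A small sketch of how these can be used, for example in debug messages. The names FUNC_SIG, where_am_i and is_cpp14_or_newer are hypothetical. MSVC reports __cplusplus as 199711L unless /Zc:__cplusplus is passed, so the last check is mainly meaningful on GCC and Clang.

```cpp
#include <string>

// Portable spelling of the "pretty" function signature:
// __PRETTY_FUNCTION__ on GCC/Clang, __FUNCSIG__ on MSVC.
// FUNC_SIG is a hypothetical macro name.
#if defined(_MSC_VER) && !defined(__clang__)
#define FUNC_SIG __FUNCSIG__
#else
#define FUNC_SIG __PRETTY_FUNCTION__
#endif

// Returns "<signature>:<line>" for the place where it is written.
std::string where_am_i() {
    return std::string(FUNC_SIG) + ":" + std::to_string(__LINE__);
}

// True when the compiler is in C++14 mode or newer (e.g. -std=c++14).
bool is_cpp14_or_newer() {
    return __cplusplus >= 201402L;
}
```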

Ref:
1. https://cs.nyu.edu/exact/core/doc/stackOverflow.txt [Correction: the stack size can be changed on Linux by passing -Wl,--stack=size to the linker; supported by Clang/LLVM]
2. https://stackoverflow.com/questions/20825372/pragma-commentlinker-stack16777216
3. https://codeforces.com/blog/entry/57646
4. https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html#Function-Specific-Option-Pragmas
5. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
6. https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
7. https://codeforces.com/blog/entry/15547
8. https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
9. http://man7.org/linux/man-pages/man2/setrlimit.2.html
10. https://codeforces.com/blog/entry/79?#comment-436209
11. https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html
12. https://clang.llvm.org/docs/UsersManual.html
13. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

Resource:

C++ Documentation:

Books/Manuals
1. Intel® 64 and IA-32 Architectures Optimization Reference Manual
2. Intel® 64 and IA-32 Architectures Software Developer’s Manual
3. AMD developers resource

Content of this snippet is under CC 4.0 license
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
