Commit bb564d05 authored by sydney

Simple decoder MS

parent 0139c68d
Subproject commit 76d350ae2a3fb157635e2d2163e44ebe6d1b76c1
# Fast_LDPC_decoder_for_x86
This is the source code of the fast x86 LDPC decoders whose description and
optimization techniques are published in the IEEE TPDS journal (the article is
not yet accepted; it is in major revision).
In this git repository we publish the source code of an LDPC decoder
implementation optimized for x86 targets. This LDPC decoder implementation
efficiently takes advantage of the SIMD and SPMT programming models. The
approach used to achieve very high
Originally, this source code was part of a much larger project
that lets us experiment with LDPC decoding algorithms, data formats, etc.
This explains why some pieces of code may be useless for you ;-)
To compile the LDPC decoder, you currently have to use the Intel C++
compiler, because the AWGN channel model is implemented using MKL library
functions.
Source code compilation:
########################
UNTIL NOW, THIS CODE HAS ONLY BEEN COMPILED ON MACOS WITH THE INTEL COMPILER (ICC). IF YOU
RUN INTO ISSUES ON OTHER PLATFORMS, PLEASE CONTACT ME THROUGH THE FORUM
OR BY E-MAIL (bertrand.legal@ims-bordeaux.fr)
To compile the LDPC decoders, open a terminal and go to
the "bin" directory.
> cd source_path
> cd bin
Then compile the source code using "make":
> make
The output should look like this:
[C++] ../src/CBitGenerator/CBitGenerator.cpp
[C++] ../src/CChanel/CChanel.cpp
[C++] ../src/CChanel/CChanelAWGN_MKL.cpp
[C++] ../src/CDecoder/template/CDecoder.cpp
[C++] ../src/CDecoder/template/CDecoder_fixed.cpp
[C++] ../src/CDecoder/template/CDecoder_fixed_AVX.cpp
[C++] ../src/CDecoder/template/CDecoder_fixed_SSE.cpp
[C++] ../src/CDecoder/OMS/CDecoder_OMS_fixed_SSE.cpp
[C++] ../src/CDecoder/OMS/CDecoder_OMS_fixed_AVX.cpp
[C++] ../src/CDecoder/NMS/CDecoder_NMS_fixed_SSE.cpp
[C++] ../src/CDecoder/NMS/CDecoder_NMS_fixed_AVX.cpp
[C++] ../src/CEncoder/CFakeEncoder.cpp
[C++] ../src/CEncoder/Encoder.cpp
[C++] ../src/CEncoder/GenericEncoder.cpp
[C++] ../src/CErrorAnalyzer/CErrorAnalyzer.cpp
[C++] ../src/CFixPointConversion/CFastFixConversion.cpp
[C++] ../src/CFixPointConversion/CFixConversion.cpp
[C++] ../src/CTerminal/CTerminal.cpp
[C++] ../src/CTimer/CTimer.cpp
[C++] ../src/CTools/CTools.cpp
[C++] ../src/CTools/transpose_avx.cpp
[C++] ../src/CTrame/CTrame.cpp
[C++] ../src/main_p.cpp
[LINKING] main.icc
The compilation of the 576x288 LDPC decoder (default configuration) was successful;
the executable file is named "main.icc". To launch the compiled LDPC decoder
(576x288), execute "main.icc" with some parameters:
OMS decoder (offset = 1/8)
> ./main.icc -fixed -avx -OMS 1 -min 0.5 -max 4.0 -iter 20
NMS decoder (factor = 29/32)
> ./main.icc -fixed -avx -NMS 29 -min 0.5 -max 4.0 -iter 20
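To make the two correction parameters more concrete, here is a minimal Python sketch of an offset and a normalized min-sum correction in integer arithmetic. It assumes that `-OMS 1` denotes an offset of one LSB (1/8 with three fractional bits) and that `-NMS 29` denotes a factor of 29/32 applied as a multiply followed by a 5-bit shift. The function names and the scalar form are hypothetical; they are not taken from the decoder sources, which rely on SSE/AVX kernels.

```python
# Hypothetical scalar sketch of the two min-sum corrections (not the SIMD kernels).
# Assumption: magnitudes are integers with 3 fractional bits, so 1 represents 1/8.


def oms_correct(min_mag, offset=1):
    """Offset min-sum: subtract the offset from the magnitude, saturating at 0."""
    return max(min_mag - offset, 0)


def nms_correct(min_mag, factor=29):
    """Normalized min-sum: scale the magnitude by factor/32 (right shift by 5)."""
    return (min_mag * factor) >> 5


def corrected_minimum(magnitudes, correct):
    """Smallest of the given message magnitudes, followed by the correction."""
    return correct(min(magnitudes))


print(corrected_minimum([12, 7, 20], oms_correct))  # 7 - 1 = 6
print(corrected_minimum([12, 7, 20], nms_correct))  # (7 * 29) >> 5 = 6
```

Both corrections shrink the minimum magnitude slightly, which is the usual way to compensate for the over-estimation made by the plain min-sum approximation.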
Air throughput measurement:
###########################
To measure the throughput performance, use:
> ./main.icc -fixed -sse -NMS 29 -iter 20 -fer 10000000 -timer 10 -thread 1
+> timer 10 : measure and average over 10 seconds,
+> thread 1 : use one processor core only (the values 1, 2 and 4 are supported).
The result looks like this:
> (PERF) H. LAYERED 16 fixed, 576x288 LDPC code, 20 its, 1 threads
> (PERF) Kernel Execution time = 1638690 us for 229376 frames => 80.626 Mbps
> (PERF) SNR = 0.50, ITERS = 20, THROUGHPUT = 80.626 Mbps
> (PERF) LDPC decoder air throughput = 80.626 Mbps
> (II) THE SIMULATION HAS STOP DUE TO THE (USER) TIME CONTRAINT.
> (PERF1) LDPC decoder air throughput = 84.699 Mbps
(PERF) gives the throughput when the execution time is measured inside the
simulation loop. (PERF1) gives the throughput of the decoder measured outside the
simulation loop.
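The throughput printed on the (PERF) lines can be cross-checked by hand. The sketch below assumes that the air throughput counts all 576 bits of each decoded frame (coded bits, not only information bits); under that assumption it reproduces the 80.626 Mbps of the log above. The helper name is hypothetical.

```python
def air_throughput_mbps(frames, code_length, elapsed_us):
    """Coded bits decoded per microsecond, which is the same as Mbit/s."""
    return frames * code_length / float(elapsed_us)


# Values taken from the "(PERF) Kernel Execution time" line of the log.
print("%.3f Mbps" % air_throughput_mbps(229376, 576, 1638690))  # 80.626 Mbps
```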
Extending the evaluation:
#########################
To compile more LDPC decoders (other code lengths, etc.), run
the build.py script from the bin directory:
> ../scripts/build.py
CC=g++
CFLAGS= -g -I../cpp_src -W -Wall -O0 \
-fopenmp -finline -funroll-loops -opt-prefetch -unroll-aggressive \
-m64 -DMKL_ILP64 -I../src -msse4a -march=native -I${MKL_ROOT}/include/
LDFLAGS=-Wl,--start-group \
${MKL_ROOT}/lib/intel64/libmkl_intel_ilp64.a \
${MKL_ROOT}/lib/intel64/libmkl_gnu_thread.a \
${MKL_ROOT}/lib/intel64/libmkl_core.a \
-Wl,--end-group \
-lgomp -lpthread -lm -ldl -fopenmp
EXEC=main.icc
#CFLAGS=-Wa,-q -I../cpp_src -W -Wall -ansi -pedantic -O3 -Wall -funroll-loops -ftree-vectorize \
# -msse4a -march=native -mtune=native -ffast-math -fopenmp \
# -fstrict-aliasing -fprefetch-loop-arrays -I../src
SRC= \
../src/CBitGenerator/CBitGenerator.cpp \
../src/CChanel/CChanel.cpp \
../src/CChanel/CChanelAWGN_MKL.cpp \
../src/CDecoder/template/CDecoder.cpp \
../src/CDecoder/template/CDecoder_fixed.cpp \
../src/CDecoder/template/CDecoder_fixed_SSE.cpp \
../src/CDecoder/template/CDecoder_fixed_reds.cpp \
../src/CDecoder/OMS/CDecoder_OMS_fixed_SSE.cpp \
../src/CDecoder/MS/CDecoder_MS_fixed_reds.cpp \
../src/CDecoder/NMS/CDecoder_NMS_fixed_SSE.cpp \
../src/CEncoder/CFakeEncoder.cpp \
../src/CEncoder/Encoder.cpp \
../src/CEncoder/GenericEncoder.cpp \
../src/CErrorAnalyzer/CErrorAnalyzer.cpp \
../src/CFixPointConversion/CFastFixConversion.cpp \
../src/CFixPointConversion/CFixConversion.cpp \
../src/CTerminal/CTerminal.cpp \
../src/CTimer/CTimer.cpp \
../src/CTools/CTools.cpp \
../src/CTrame/CTrame.cpp \
../src/main_p.cpp \
../src/CDecoder/template/CDecoder_fixed_AVX.cpp \
../src/CDecoder/OMS/CDecoder_OMS_fixed_AVX.cpp \
../src/CDecoder/NMS/CDecoder_NMS_fixed_AVX.cpp \
../src/CTools/transpose_avx.cpp
OBJ= $(SRC:.cpp=.o)
all: $(EXEC)

main.icc: $(OBJ)
	@echo "[LINKING] $@"
	@$(CC) -o $@ $^ $(LDFLAGS)

%.o: %.cpp
	@echo "[C++] $< $@"
	@$(CC) $(CFLAGS) -o $@ -c $<

.PHONY: clean mrproper

clean:
	find ../src/ -name "*.o" -exec rm {} \;
	find . -name "*.ic*" -exec rm {} \;

mrproper: clean
	rm $(EXEC)
CC=icc
CFLAGS=-I../cpp_src -W -Wall -O3 -march=native -fast -ansi-alias \
-fopenmp -finline -funroll-loops -no-prec-div -opt-prefetch -unroll-aggressive \
-m64 -auto-ilp32 -I../src -xCORE-AVX2 -fma -mkl -I/opt/local/include/
LDFLAGS=-L/opt/local/lib -fopenmp -lboost_system-mt -lboost_timer-mt -mkl -lPcmMsr
EXEC=main.icc
#CFLAGS=-Wa,-q -I../cpp_src -W -Wall -ansi -pedantic -O3 -Wall -funroll-loops -ftree-vectorize \
# -msse4a -march=native -mtune=native -ffast-math -fopenmp \
# -fstrict-aliasing -fprefetch-loop-arrays -I../src
SRC= \
../src/CBitGenerator/CBitGenerator.cpp \
../src/CChanel/CChanel.cpp \
../src/CChanel/CChanelAWGN_MKL.cpp \
../src/CDecoder/template/CDecoder.cpp \
../src/CDecoder/template/CDecoder_fixed.cpp \
../src/CDecoder/template/CDecoder_fixed_AVX.cpp \
../src/CDecoder/template/CDecoder_fixed_SSE.cpp \
../src/CDecoder/OMS/CDecoder_OMS_fixed_SSE.cpp \
../src/CDecoder/OMS/CDecoder_OMS_fixed_AVX.cpp \
../src/CDecoder/NMS/CDecoder_NMS_fixed_SSE.cpp \
../src/CDecoder/NMS/CDecoder_NMS_fixed_AVX.cpp \
../src/CEncoder/CFakeEncoder.cpp \
../src/CEncoder/Encoder.cpp \
../src/CEncoder/GenericEncoder.cpp \
../src/CErrorAnalyzer/CErrorAnalyzer.cpp \
../src/CFixPointConversion/CFastFixConversion.cpp \
../src/CFixPointConversion/CFixConversion.cpp \
../src/CTerminal/CTerminal.cpp \
../src/CTimer/CTimer.cpp \
../src/CTools/CTools.cpp \
../src/CTools/transpose_avx.cpp \
../src/CTrame/CTrame.cpp \
../src/main_p.cpp
OBJ= $(SRC:.cpp=.o)
all: $(EXEC)

main.icc: $(OBJ)
	@echo "[LINKING] $@"
	@$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)

%.o: %.cpp
	@echo "[C++] $<"
	@$(CC) $(CFLAGS) -o $@ -c $< $(CFLAGS)

.PHONY: clean mrproper

clean:
	find ../src/ -name "*.o" -exec rm {} \;
	find . -name "*.ic*" -exec rm {} \;

mrproper: clean
	rm $(EXEC)
CODE 64800x6480.dvb-s2
CODE 64800x7200.dvb-s2
CODE 64800x32400.dvb-s2
#!/usr/bin/python
import subprocess
import os
C_LIST = []
for root, dirs, files in os.walk("../src/Constantes/"):
    for code in dirs:
        C_LIST.append( code )
    break
#C_LIST=[
# '155x93', '200x100', '816x408', '1024x518', '1056x528', '1200x600',
# '1248x624', '2640x1320',
# '4000x2000', '4896x2448', '8000x4000', '9972x4986', '20000x10000',
# '2388x597',
# 802.11e
# '576x288', '960x480', '2304x1152',
# 802.11n
# '1944x972', '1944x648', '1944x486',
# 802.11an
# '2048x384',
# Codes DVB-NGH
# '9216x4608', '9216x2304',
# Codes DVB-NGH
# '16200x9000', '16200x7560', '16200x6480', '16200x5400', '16200x4320', '16200x2880',
# Codes DVB-S2
# '64800x6480', '64800x7200', '64800x10800', '64800x16200', '64800x21600', '64800x32400',
# '4000x2000']
C_LIST=['64800x6480.dvb-s2', '64800x7200.dvb-s2', '64800x32400.dvb-s2']
GCC_LIST=['']
subprocess.call('rm main.icc main.icc.*',shell=True)
for C in C_LIST:
    subprocess.call('echo "CODE ' + C + '" >> results_article',shell=True)
    subprocess.call('echo "#include \\"./' + C + '/constantes.h\\"" > ../src/Constantes/constantes.h',shell=True)
    subprocess.call('echo "#include \\"./' + C + '/constantes_sse.h\\"" > ../src/Constantes/constantes_sse.h',shell=True)
    for COMPILER in GCC_LIST:
        subprocess.call('echo "COMPILATION OF LDPC DECODER FOR CODE [' + C + ']"',shell=True)
        output = subprocess.check_output('make -f Makefile' + COMPILER + ' clean',shell=True)
        output = subprocess.check_output('make -f Makefile' + COMPILER + ' -j 8',shell=True)
        output = subprocess.check_output('mv main.icc main.icc.' + C + '',shell=True)
#check_call
#!/usr/bin/python
import subprocess
import os
C_LIST = []
for root, dirs, files in os.walk("../src/Constantes/"):
    for code in dirs:
        C_LIST.append( code )
    break
C_LIST=[
'576x288', '1944x972', '2048x384', '2304x1152', '4000x2000'
]
GCC_LIST=['.icc']
for C in C_LIST:
    for COMPILER in GCC_LIST:
        subprocess.call('echo "EXECUTION OF LDPC DECODER FOR CODE [' + C + ']"',shell=True)
        subprocess.call('echo "xMS - 10 its - sse core"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -OMS 1 -fer 10000000 -min 0.50 -max 0.51 -iter 10 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 10 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('echo "xMS - 10 its - avx cores"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -OMS 1 -fer 10000000 -min 0.50 -max 0.51 -iter 10 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 10 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('echo "xMS - 20 its - sse core"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -OMS 1 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('echo "xMS - 20 its - avx cores"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -OMS 1 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
#!/usr/bin/python
import subprocess
import os
C_LIST = []
for root, dirs, files in os.walk("../src/Constantes/"):
    for code in dirs:
        C_LIST.append( code )
    break
C_LIST=[
'576x288', '1944x972', '2048x384', '2304x1152', '4000x2000'
]
# GCC_LIST was used below without being defined; same value as in the previous script.
GCC_LIST=['.icc']
for C in C_LIST:
    for COMPILER in GCC_LIST:
        subprocess.call('echo "EXECUTION OF LDPC DECODER FOR CODE [' + C + ']"',shell=True)
        # Labels aligned with the flags actually passed (-NMS 29, -iter 20, 1/2/4 threads).
        subprocess.call('echo "NMS - 20 its - sse - 1, 2, 4 threads"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 2 | grep "(PERF2) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -sse -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 4 | grep "(PERF4) Total Kernel throughput"',shell=True)
        subprocess.call('echo "NMS - 20 its - avx - 1, 2, 4 threads"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 1 | grep "(PERF1) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 2 | grep "(PERF2) Total Kernel throughput"',shell=True)
        subprocess.call('./main.icc.' + C + ' -fixed -avx -NMS 29 -fer 10000000 -min 0.50 -max 0.51 -iter 20 -timer 30 -thread 4 | grep "(PERF4) Total Kernel throughput"',shell=True)
CODE 64800x6480.dvb-s2
CODE 64800x6480.dvb-s2
/**
Copyright (c) 2012-2015 "Bordeaux INP, Bertrand LE GAL"
[http://legal.vvv.enseirb-matmeca.fr]
This file is part of LDPC_C_Simulator.
LDPC_C_Simulator is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "CBitGenerator.h"
CBitGenerator::CBitGenerator(CTrame *t, bool zero_only){
    _vars      = t->nb_vars();
    t_in_bits  = t->get_t_in_bits();
    _zero_mode = zero_only;
    for(int i=0; i<_vars; i++){
        t_in_bits[i] = 0;
    }
}

void CBitGenerator::generate(){
    if( _zero_mode == false ){
        for(int i=0; i<_vars; i++){
            t_in_bits[i] = rand()%2;
        }
    }
}
/**
Copyright (c) 2012-2015 "Bordeaux INP, Bertrand LE GAL"
[http://legal.vvv.enseirb-matmeca.fr]
This file is part of LDPC_C_Simulator.
LDPC_C_Simulator is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef CLASS_CCbitGenerator
#define CLASS_CCbitGenerator
#include "../CTrame/CTrame.h"
class CBitGenerator
{
protected:
    int  _vars;
    bool _zero_mode;
    int* t_in_bits; // size (var)

public:
    CBitGenerator(CTrame *t, bool zero_only);
    virtual void generate();
};
#endif
/**
Copyright (c) 2012-2015 "Bordeaux INP, Bertrand LE GAL"
[http://legal.vvv.enseirb-matmeca.fr]
This file is part of LDPC_C_Simulator.
LDPC_C_Simulator is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "CChanel.h"
double CChanel::get_R(){
    return R;
}

double CChanel::get_SigB(){
    return SigB;
}

CChanel::~CChanel(){
}

CChanel::CChanel(CTrame *t, int _BITS_LLR, bool QPSK, bool ES_N0){
    qbeta        = 0.0;
    R            = 0.0;
    _vars        = t->nb_vars();
    _data        = t->nb_data();
    _checks      = t->nb_checks();
    t_coded_bits = t->get_t_coded_bits();
    t_noise_data = t->get_t_noise_data();
    _frames      = t->nb_frames();
    BITS_LLR     = _BITS_LLR;
    qpsk         = QPSK;
    es_n0        = ES_N0;
    normalize    = false;
    norm_factor  = 0.0f;
}

void CChanel::setNormalize(bool enable){
    normalize = enable;
}
/**
Copyright (c) 2012-2015 "Bordeaux INP, Bertrand LE GAL"
[http://legal.vvv.enseirb-matmeca.fr]
This file is part of LDPC_C_Simulator.
LDPC_C_Simulator is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef CLASS_CChanel
#define CLASS_CChanel
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "../CTrame/CTrame.h"
#define small_pi 3.1415926536
#define _2pi (2.0 * small_pi)
class CChanel
{
protected:
    int    _vars;
    int    _checks;
    int    _data;
    int    _frames;
    int    BITS_LLR;
    int*   data_in;
    int*   data_out;
    bool   qpsk;
    bool   es_n0;
    bool   normalize; // Normalize by 2/pow(sigma, 2)
    float  norm_factor;
    float* t_noise_data; // size (width)
    int*   t_coded_bits; // size (width)
    double rendement;
    double SigB;
    double Gauss;
    double Ph;
    double Qu;
    double Eb_N0;
    double qbeta;
    double R;

public:
    CChanel(CTrame *t, int _BITS_LLR, bool QPSK, bool Es_N0);
    virtual ~CChanel();
    virtual void configure(double _Eb_N0) = 0; // PURE VIRTUAL
    virtual double get_R();
    virtual double get_SigB();
    virtual void setNormalize(bool enable);
    virtual void generate() = 0; // PURE VIRTUAL
};
#endif
/**
Copyright (c) 2012-2015 "Bordeaux INP, Bertrand LE GAL"
[http://legal.vvv.enseirb-matmeca.fr]
This file is part of LDPC_C_Simulator.
LDPC_C_Simulator is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include "CChanelAWGN_MKL.h"
#ifndef VSL_METHOD_SGAUSSIAN_BOXMULLER2
#define VSL_METHOD_SGAUSSIAN_BOXMULLER2 1
#endif
//
// SSE-OPTIMIZED SQUARE ROOT USING THE RECIPROCAL
// FUNCTION (11-BIT PRECISION ON THE MANTISSA)
//
//inline __m256 sqrt_sse_11bits( __m256 a )
//{
// return _mm256_mul_ps( a, _mm256_rsqrt_ps( a ) );
//}
double CChanelAWGN_MKL::inv_erf(int v){
    if (v == 3) {
        return 0.86312;
    }else if(v == 4){
        return 1.1064;
    }else if(v == 5){
        return 1.3268;
    }else if(v == 6){
        return 1.5274;
    }else if(v == 7){
        return 1.7115;
    }else if(v == 8){
        return 1.8819;
    }else if(v == 9){
        return 2.0409;
    }else if(v == 10){
        return 2.1903;
    }
    return -1;
}

double CChanelAWGN_MKL::get_R(){
    return R;
}
#define AVX_8F_LOAD(ptr) (_mm256_load_ps(ptr))
#define AVX_8F_STORE(ptr,val) (_mm256_store_ps(ptr,val))
#define AVX_8F_SQRT(a) (_mm256_sqrt_ps(a))
#define AVX_8F_ADD(a,b) (_mm256_add_ps(a,b))
#define AVX_8F_SUB(a,b) (_mm256_sub_ps(a,b))
#define AVX_8F_MUL(a,b) (_mm256_mul_ps(a,b))
#define AVX_8F_LOG(a) (_mm256_log_ps(a))
#define AVX_8F_DIV(a,b) (_mm256_div_ps(a,b))
#define AVX_8F_SET1(a) (_mm256_set1_ps(a))
#define AVX_8F_SET1i(a) (_mm256_set1_epi32(a))
#define AVX_8F_CONV(a) (_mm256_cvtepi32_ps(a))
#define AVX_8F_SETi(a,b,c,d,e,f,g,h) (_mm256_set_epi32(a,b,c,d,e,f,g,h))
static int thread_id = 0;
CChanelAWGN_MKL::CChanelAWGN_MKL(CTrame *t, int _BITS_LLR, bool QPSK, bool Es_N0)
    : CChanel(t, _BITS_LLR, QPSK, Es_N0){
    int status = vslNewStream( &stream, VSL_BRNG_MT2203 + thread_id++ /*VSL_BRNG_MT2203*/, rand() );
    if( status != VSL_STATUS_OK ){
        printf("(EE) Error during vslNewStream execution\n");
        printf("(EE) thread_id = %d\n", thread_id);
        exit( EXIT_FAILURE ); // report the failure to the caller instead of exiting with 0
    }
    noise = (float*)new __m128[_frames * _data / 4];
}

CChanelAWGN_MKL::~CChanelAWGN_MKL(){
    vslDeleteStream( &stream );
    // The buffer was allocated with new __m128[], so release it with delete[]
    // through the same type.
    delete[] reinterpret_cast<__m128*>(noise);
    thread_id--;
}
void CChanelAWGN_MKL::configure(double _Eb_N0){
    rendement = (float) (_vars) / (float) (_data);
    if (es_n0) {
        // ES/N0 = Eb/N0 + 10*log10(R*m)
        // where R = code rate
        //       m = number of bits per constellation symbol (QPSK => 2)
        // Eb/N0 and ES/N0 are in dB
        Eb_N0 = _Eb_N0 - 10.0 * log10(2 * rendement);
    } else {
        Eb_N0 = _Eb_N0;
    }
    double interm = 10.0 * log10(rendement);
    interm = -0.1*((double)Eb_N0+interm);
    SigB   = sqrt(pow(10.0,interm)/2);
    qbeta  = SigB * sqrt(2.0) * inv_erf( BITS_LLR - 1 ); // PATCH CEDRIC MARCHAND
    R      = (1.0 + qbeta);
//