Commit 15423cb2 authored by MartinFIT's avatar MartinFIT

Removed How to, added README

parent e9cffed5
@@ -31,8 +31,6 @@ services:
image: sequenceiq/hadoop-docker:2.7.1
container_name: hadoop
network_mode: "bridge"
#environment:
# CORE_CONF_fs_defaultFS: hdfs://192.168.99.100:9000
ports:
- 8020:8020
- 8042:8042
Docker:
Shared directories
sudo vi /mnt/sda1/var/lib/boot2docker/profile
and add:
mkdir /home/docker/Users
sudo mount -t vboxsf -o uid=1000,gid=50 c/Users /home/docker/Users
export COMPOSE_CONVERT_WINDOWS_PATHS=1
mkdir /home/docker/Docker
sudo mount -t vboxsf -o uid=1000,gid=50 c/Docker /home/docker/Docker
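The lines above are meant to live in the boot2docker profile so the mounts survive a VM restart; a sketch of the resulting file (paths, uid/gid, and the env variable are taken from the commands above, the rest is assumption):

```shell
# /mnt/sda1/var/lib/boot2docker/profile (sketch; boot2docker re-runs this on each boot)
mkdir -p /home/docker/Users /home/docker/Docker
mount -t vboxsf -o uid=1000,gid=50 c/Users /home/docker/Users
mount -t vboxsf -o uid=1000,gid=50 c/Docker /home/docker/Docker
export COMPOSE_CONVERT_WINDOWS_PATHS=1
```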
Stats
docker stats --format "table {{.ID}}\t{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"
Maven:
Running maven clean install
$ docker run -it --rm --name my-maven-project -v "$PWD":/usr/src/mymaven -v "$HOME/.m2":/root/.m2 -w /usr/src/mymaven martinfit/maven:3.5.2-jdk-9 mvn clean install
$ docker run -it --rm --name communication -v "$HOME/.m2":/root/.m2 -v "$PWD":/usr/src/app -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim mvn clean install
$ docker run -it --rm --name persistence -v "$HOME/.m2":/root/.m2 -v "$PWD":/usr/src/app -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim mvn clean install
$ docker run -it --rm --name producer-app -v "$PWD":/usr/src/app -v "$HOME/.m2":/root/.m2 -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim mvn clean install
$ docker run -it --rm --name repository -v "$PWD":/usr/src/app -v "$HOME/.m2":/root/.m2 -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim mvn clean install
Running the JAR files
$ docker run -it --rm --name producer-app -v "$PWD":/usr/src/app -v "$HOME/.m2":/root/.m2 -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim java -jar target/producer-app-1.0.jar target/classes/PCAP
$ docker run -it --rm --name distributed-repository -v "$PWD":/usr/src/app -v "$HOME/.m2":/root/.m2 -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim java -jar target/distributed-repository-1.0.jar
$ docker run -it --rm --name persistence -v "$PWD":/usr/src/app -v "$HOME/.m2":/root/.m2 -w "/usr/src/app/" martinfit/maven:3.5.2-jdk-9-slim java -jar target/persistence-1.0.jar
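The docker run invocations above all follow one pattern; a small helper (hypothetical, not part of the repository) that prints the command it would run, using the image name from the commands above, so it can be inspected before executing:

```shell
# Hypothetical wrapper for the repeated pattern above. It echoes the
# docker command instead of executing it (a dry run for inspection);
# drop the `echo` to actually run it.
mvn_in_docker() {
  local name="$1"; shift
  echo docker run -it --rm --name "$name" \
    -v "$PWD":/usr/src/app \
    -v "$HOME/.m2":/root/.m2 \
    -w /usr/src/app \
    martinfit/maven:3.5.2-jdk-9-slim "$@"
}

mvn_in_docker persistence mvn clean install
```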
Cassandra:
Opening cqlsh
$ docker run -it --link cassandra:cassandra --rm martinfit/cassandra:3 cqlsh cassandra
MongoDB:
Opening the shell
$ docker run -it --link mongodb:mongo --rm mongo:3.4 mongo 192.168.99.100:27017
Listing the databases
> show dbs
Switching to a database
> use metadata
Listing the collections in the database
> db.getCollectionNames()
Printing the contents of a collection
> db.packet_metadata.find().forEach(printjson)
Number of documents in a collection
> db.packet_metadata.count()
Dropping a collection
> db.packet_metadata.drop()
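The same queries can also be run non-interactively with `--eval` (a sketch using the same linked-container setup, database, and collection as in the shell session above):

```shell
# Non-interactive variant of the session above: run a single query
# against the metadata database instead of opening an interactive shell.
docker run -it --link mongodb:mongo --rm mongo:3.4 \
  mongo 192.168.99.100:27017/metadata --quiet \
  --eval 'db.packet_metadata.count()'
```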
Kafka:
Creating a topic
$ docker exec kafka kafka-topics.sh --create --zookeeper 192.168.99.100:2181 --replication-factor 1 --partitions 1 --topic firsttopic
Created topic "firsttopic".
Configuring a topic
See http://kafka.apache.org/documentation/ under "3.2 Topic-Level Configs"
$ docker exec kafka kafka-configs.sh --zookeeper 192.168.99.100:2181 --entity-type topics --entity-name firsttopic --alter --add-config max.message.bytes=1000000000
Describing the topics
$ docker exec kafka kafka-topics.sh --describe --zookeeper 192.168.99.100:2181
Topic:firsttopic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: firsttopic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Starting the console producer script
$ docker exec -it kafka kafka-console-producer.sh --broker-list 192.168.99.100:9092 --topic firsttopic
Starting the console consumer script
$ docker exec -it kafka kafka-console-consumer.sh --bootstrap-server 192.168.99.100:9092 --topic firsttopic
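A quick end-to-end check without typing into the interactive prompt (a sketch using the same container and addresses as above; `--from-beginning` and `--max-messages` are standard console-consumer flags):

```shell
# Pipe a few test messages into the topic, then read them back and exit.
printf 'hello\nworld\n' | docker exec -i kafka kafka-console-producer.sh \
  --broker-list 192.168.99.100:9092 --topic firsttopic
docker exec kafka kafka-console-consumer.sh \
  --bootstrap-server 192.168.99.100:9092 --topic firsttopic \
  --from-beginning --max-messages 2
```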
Hadoop:
Editing core-site.xml:
HADOOP_PREFIX = /usr/local/hadoop
docker cp hadoop:/usr/local/hadoop/etc/hadoop/core-site.xml .
-- edit in a Windows editor
docker cp core-site.xml hadoop:/usr/local/hadoop/etc/hadoop/core-site.xml
Checking that HDFS is running:
$HADOOP_PREFIX/bin/hadoop dfsadmin -report
Checking the namenode IP:
echo $($HADOOP_PREFIX/bin/hdfs getconf -namenodes)
Storing a file in HDFS
bash-4.1# bin/hdfs dfs -copyFromLocal input/file01 input/file01
Running the example program (the directory /usr/local/hadoop/input is the input directory in HDFS)
$ docker exec -it hadoop bash
bash-4.1# cd $HADOOP_PREFIX
bash-4.1# ls
LICENSE.txt NOTICE.txt README.txt bin etc include input lib libexec logs sbin share
bash-4.1# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount input/file01 output/file01
17/11/19 07:29:50 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/11/19 07:29:51 INFO input.FileInputFormat: Total input paths to process : 1
17/11/19 07:29:51 INFO mapreduce.JobSubmitter: number of splits:1
17/11/19 07:29:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1511088777765_0007
17/11/19 07:29:52 INFO impl.YarnClientImpl: Submitted application application_1511088777765_0007
17/11/19 07:29:52 INFO mapreduce.Job: The url to track the job: http://c01ff504b24e:8088/proxy/application_1511088777765_0007/
17/11/19 07:29:52 INFO mapreduce.Job: Running job: job_1511088777765_0007
17/11/19 07:30:01 INFO mapreduce.Job: Job job_1511088777765_0007 running in uber mode : false
17/11/19 07:30:01 INFO mapreduce.Job: map 0% reduce 0%
17/11/19 07:30:10 INFO mapreduce.Job: map 100% reduce 0%
17/11/19 07:30:21 INFO mapreduce.Job: map 100% reduce 100%
17/11/19 07:30:22 INFO mapreduce.Job: Job job_1511088777765_0007 completed successfully
17/11/19 07:30:22 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=52
FILE: Number of bytes written=230403
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=134
HDFS: Number of bytes written=30
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=7206
Total time spent by all reduces in occupied slots (ms)=7596
Total time spent by all map tasks (ms)=7206
Total time spent by all reduce tasks (ms)=7596
Total vcore-seconds taken by all map tasks=7206
Total vcore-seconds taken by all reduce tasks=7596
Total megabyte-seconds taken by all map tasks=7378944
Total megabyte-seconds taken by all reduce tasks=7778304
Map-Reduce Framework
Map input records=1
Map output records=4
Map output bytes=38
Map output materialized bytes=52
Input split bytes=112
Combine input records=4
Combine output records=4
Reduce input groups=4
Reduce shuffle bytes=52
Reduce input records=4
Reduce output records=4
Spilled Records=8
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=216
CPU time spent (ms)=1340
Physical memory (bytes) snapshot=298913792
Virtual memory (bytes) snapshot=1391685632
Total committed heap usage (bytes)=137252864
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=22
File Output Format Counters
Bytes Written=30
Sample output
bash-4.1# bin/hdfs dfs -ls output/
Found 1 items
drwxr-xr-x - root supergroup 0 2017-11-19 07:30 output/file01
bash-4.1# bin/hdfs dfs -cat output/file01/part-r-00000
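To get the result out of HDFS and onto the host, the output file can be copied in two hops (paths follow the session above; the `/tmp/wordcount-result` staging path is an assumption):

```shell
# Inside the hadoop container: copy the result from HDFS to the container filesystem
bin/hdfs dfs -get output/file01/part-r-00000 /tmp/wordcount-result
# On the host: copy it out of the container into the current directory
docker cp hadoop:/tmp/wordcount-result .
```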
Guide to operating the Docker environment
Requirements
- Docker (Docker Toolbox) and Oracle VM VirtualBox installed;
it is quite possible that under Linux the technology configuration will need minor adjustments (in Environment/docker-compose.yml)
- start the Docker environment, e.g. with the Docker Quickstart Terminal
Creating the virtual machine
- the virtual machine should be created during the Docker installation, but
it can also be recreated by running the script docker-machine-recreate.sh
Downloading the technologies in the Docker environment
- run the script install-docker-enviroment.sh, which downloads the images of the
technologies from DockerHub: Cassandra, Kafka, ZooKeeper, MongoDB, Hadoop
Starting the technologies in the Docker environment
- run the script run-docker-enviroment.sh, which starts all of the technologies listed above
Stopping the technologies in the Docker environment
- run the script stop-docker-enviroment.sh
Guide to operating the technologies in the Docker environment
Depending on whether Docker or Docker Toolbox is used, different IP addresses apply:
- Docker: localhost
- Docker Toolbox: the IP of the virtual machine (192.168.99.100)
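The distinction above can be resolved in a script; a sketch that falls back to localhost when no Docker Toolbox machine is available (the machine name `default` is an assumption):

```shell
# Pick the host IP: Docker Toolbox exposes the VM's IP via docker-machine,
# plain Docker publishes ports on localhost.
if command -v docker-machine >/dev/null 2>&1; then
  DOCKER_HOST_IP=$(docker-machine ip default 2>/dev/null || echo 192.168.99.100)
else
  DOCKER_HOST_IP=localhost
fi
echo "Using host IP: $DOCKER_HOST_IP"
```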
Kafka
Creating a topic
$ docker exec kafka kafka-topics.sh --create --zookeeper 192.168.99.100:2181 --replication-factor 1 --partitions 1 --topic input_topic
Created topic "input_topic".
Configuring a topic
See http://kafka.apache.org/documentation/ under "3.2 Topic-Level Configs"
$ docker exec kafka kafka-configs.sh --zookeeper 192.168.99.100:2181 --entity-type topics --entity-name input_topic --alter --add-config max.message.bytes=1000000000
Describing the topics
$ docker exec kafka opt/kafka/bin/kafka-topics.sh --describe --zookeeper 192.168.99.100:2181
Topic:input_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: input_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Starting the demo producer script
$ docker exec -it kafka kafka-console-producer.sh --broker-list 192.168.99.100:9092 --topic input_topic
Starting the demo consumer script
$ docker exec -it kafka kafka-console-consumer.sh --bootstrap-server 192.168.99.100:9092 --topic input_topic
Cassandra
Opening cqlsh
$ docker run -it --link cassandra:cassandra --rm martinfit/cassandra:3 cqlsh cassandra
MongoDB
Opening the Mongo shell
$ docker run -it --link mongodb:mongo --rm mongo:3.4 mongo 192.168.99.100:27017
Distributed repository
The implementation consists of four Maven modules:
- Communication: the communication interface
- Persistence: handling of Cassandra and MongoDB
- DistributedRepository: the distributed repository system itself; once started, it runs continuously
- ProducerDemo: a client application
The Communication and Persistence modules have to be installed first, because the other two applications use them as dependencies.
cd Communication
./install.sh
cd Persistence
./install.sh
Then the remaining two applications need to be installed.
cd DistributedRepository
./install.sh
cd ProducerDemo
./install.sh
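The four install steps above can be wrapped in one function (a sketch; it assumes it is run from the repository root and that each module directory contains install.sh, as shown above):

```shell
# Run the module installs in dependency order; stop at the first failure.
install_all() {
  for module in Communication Persistence DistributedRepository ProducerDemo; do
    (cd "$module" && ./install.sh) || { echo "install failed in $module" >&2; return 1; }
  done
}
# Usage: install_all
```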
These are projects created in the IntelliJ IDEA development environment.
For IDEA there is a Docker plugin which, once connected, shows the images and running containers, and also lets you inspect their settings and logs.
# Master-Thesis
The Docker environment is located in the Docker directory
Requirements
- Docker installed (Docker Toolbox and Oracle VM VirtualBox under Windows)
- The current configuration in Environment/docker-compose.yml is for the Docker
environment under Windows; under Linux the configuration will need minor
adjustments (replace the virtual machine's IP address with localhost, etc.)
- Under Windows: start the Docker daemon, e.g. with the Docker Quickstart Terminal
Creating the virtual machine
- The virtual machine should be created during the Docker Toolbox installation, but
it can also be recreated by running the script docker-machine-recreate.sh
(mind the machine parameters - RAM size, number of CPU cores, and disk size can be set)
Downloading the technologies in the Docker environment
- Run the script install-docker-enviroment.sh, which downloads the images of the
technologies from DockerHub: Cassandra, Kafka, ZooKeeper, MongoDB, Hadoop
Starting the technologies in the Docker environment
- Run the script run-docker-enviroment.sh, which starts all of the technologies listed above
Stopping the technologies in the Docker environment
- Run the script stop-docker-enviroment.sh
Statistics for the running containers (RAM and CPU usage, ...) can be watched in a convenient table with
$ docker stats --format "table {{.ID}}\t{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"
$ docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"
Depending on whether Docker or Docker Toolbox is used, different IP addresses apply:
- Docker: localhost
- Docker Toolbox: the IP of the virtual machine (192.168.99.100)
Accessing a running container
- Cassandra
Opening cqlsh
$ docker run -it --link cassandra:cassandra --rm martinfit/cassandra:3 cqlsh cassandra
- MongoDB
Opening the Mongo shell
$ docker run -it --link mongodb:mongo --rm mongo:3.4 mongo 192.168.99.100:27017
Kafka
- The Kafka topics are created automatically, but they can also be created with the following command:
$ docker exec kafka kafka-topics.sh --create --zookeeper 192.168.99.100:2181 \
--replication-factor 1 --partitions 1 --topic input_topic
Created topic "input_topic".
- Describing the topics
$ docker exec kafka opt/kafka/bin/kafka-topics.sh --describe --zookeeper 192.168.99.100:2181
Topic:input_topic PartitionCount:1 ReplicationFactor:1 Configs:
Topic: input_topic Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Hadoop/HDFS
- The applications need to reach the containers directly at the container address
(a running container's address typically looks like 172.17.0.4);
under Windows, routes have to be added to the routing table from a command prompt run as administrator:
route add 172.17.0.0 mask 255.255.0.0 192.168.99.100
route add 172.18.0.0 mask 255.255.0.0 192.168.99.100
and so on; this is needed because of the virtual machine, and is not necessary under Linux
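The actual subnets to route can be read from Docker instead of guessed (standard docker CLI; compose projects may create networks of their own, so all networks are listed):

```shell
# Print each Docker network's subnet, to know which routes to add on the host.
docker network ls --format '{{.Name}}' | while read -r net; do
  docker network inspect "$net" \
    --format '{{.Name}}: {{range .IPAM.Config}}{{.Subnet}}{{end}}'
done
```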
The distributed storage system
The implementation consists of four Maven modules:
- Communication: the communication interface
- Persistence: handling of Cassandra and MongoDB
- DistributedRepository: the distributed repository system itself; once started, it runs continuously
- ProducerDemo: a client application
The Communication and Persistence modules have to be installed first,
because the other two applications use them as dependencies.
The installation is done in the following order:
cd Communication
./install.sh
cd Persistence
./install.sh
cd DistributedRepository
./install.sh
cd ProducerDemo
./install.sh
These are projects created in the IntelliJ IDEA development environment.
For IDEA there is a Docker plugin which, once connected, shows the images
and running containers, and also lets you inspect their settings and logs.