SCoRe-Group / IsomapSpark / Commits
Commit 5e91c102, authored Aug 31, 2018 by Frank Schoeneman
Remove old files and directories.
parent 323aed95
Showing 4 changed files, with 0 additions and 2034 deletions (+0, −2034):

PySpark/Isomap_Spark.py      +0  −630
PySpark/SparkIsomap.py       +0  −626
PySpark/bknn-slurm-spark.sh  +0  −108
PySpark/brute_kNN.py         +0  −670
PySpark/Isomap_Spark.py  deleted 100644 → 0  (diff collapsed)
PySpark/SparkIsomap.py  deleted 100644 → 0  (diff collapsed)
PySpark/bknn-slurm-spark.sh  deleted 100644 → 0
#!/bin/bash
####### PLANEX SPECIFIC - DO NOT EDIT THIS SECTION
#SBATCH --clusters=mae
#SBATCH --partition=planex
#SBATCH --account=pi-jzola
#SBATCH --exclusive
#SBATCH --mem=64000

####### CUSTOMIZE THIS SECTION FOR YOUR JOB
####### NOTE: --ntasks-per-node SHOULD BE SET TO INCLUDE ALL CORES IN A NODE
####### YOU CAN CONTROL CORE-TO-EXECUTOR RATIO VIA SPARK_ARGS
#SBATCH --job-name="spark"
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=20
#SBATCH --output=%j.stdout
#SBATCH --error=%j.stderr
#SBATCH --time=24:00:00

# IF SET TO 1 SPARK MASTER RUNS ON A SEPARATE NODE
exclude_master=1
# IF SET TO 1 SCRATCH AND TMP WILL BE RM -RF (RECOMMENDED)
nodes_clean=1

# MAKE SURE THAT SPARK_LOG_DIR, SPARK_LOCAL_DIRS AND SPARK_WORKER_DIR
# ARE SET IN YOUR BASHRC, FOR EXAMPLE:
# export SPARK_LOG_DIR=/scratch/
# export SPARK_LOCAL_DIRS=/scratch/
# export SPARK_WORKER_DIR=/scratch/

# ADD EXTRA MODULES HERE IF NEEDED
# YOU MAY WANT TO CHANGE SPARK VERSION
module load java/1.8.0_45
module load hadoop/2.6.0
#module load spark/2.2.0
SPARK_HOME=/projects/academic/jzola/fvschoen/spark-2.2.1

module load mkl
source $MKL/bin/mklvars.sh intel64
export LD_LIBRARY_PATH=`(pwd)`:$LD_LIBRARY_PATH

module list
echo $LD_LIBRARY_PATH

# SET YOUR COMMAND AND ARGUMENTS
#PROG="rrpknn_list.py"
PROG="brute_kNN.py"
DATA_PATH=/projects/academic/jzola/fvschoen/Isomap-Spark/PySpark/data/test_50K_1000d.tsv
ARGS="-p 100 -k 10 -n 50000 -d 1000 -f "$DATA_PATH" -o testout"

# SET EXTRA OPTIONS TO spark-submit
# EXAMPLE OPTIONS:
# --num-executors
# --executor-cores
# --executor-memory
# --driver-cores
# --driver-memory
# --py-files
SPARK_ARGS="--conf spark.driver.maxResultSize=20g --conf \"spark.locality.wait=0.5\" --driver-memory 40G --driver-cores 20 --executor-memory 55G --num-executors 4 --executor-cores 20"

####### DO NOT EDIT BELOW
SPARK_PATH=$SPARK_HOME

# GET LIST OF NODES
NODES=(`srun hostname | sort | uniq`)
NUM_NODES=${#NODES[@]}
LAST=$((NUM_NODES-1))

# FIRST NODE IS MASTER
ssh ${NODES[0]} "cd $SPARK_PATH; ./sbin/start-master.sh"
MASTER="spark://${NODES[0]}:7077"

WHO=`whoami`
echo -e "you can use this:\nssh $WHO@rush.ccr.buffalo.edu -L 4040:${NODES[0]}:4040 -N\nto enable local dashboard"

TEMP_OUT_DIR=$SLURM_SUBMIT_DIR/$SLURM_JOB_ID-spark
# save history log to spark dir
ARGS=$ARGS" -e "$TEMP_OUT_DIR

# ALL NODES ARE WORKERS
mkdir -p $TEMP_OUT_DIR
for i in `seq $exclude_master $LAST`; do
    ssh ${NODES[$i]} "cd $SPARK_PATH; nohup ./bin/spark-class org.apache.spark.deploy.worker.Worker $MASTER &> $TEMP_OUT_DIR/nohup-${NODES[$i]}.$i.out" &
done

# SUBMIT JOB
$SPARK_PATH/bin/spark-submit --master $MASTER $SPARK_ARGS $PROG $ARGS

# CLEAN SPARK JOB
ssh ${NODES[0]} "cd $SPARK_PATH; ./sbin/stop-master.sh"
for i in `seq 0 $LAST`; do
    ssh ${NODES[$i]} "killall java"
done

if [ $nodes_clean -eq 1 ]; then
    for i in `seq 0 $LAST`; do
        ssh ${NODES[$i]} "find /scratch ! -path /scratch/$SLURM_JOB_ID -user $(whoami) -delete; find /tmp ! -path /tmp/$SLURM_JOB_ID -user $(whoami) -delete"
    done
fi
PySpark/brute_kNN.py  deleted 100644 → 0  (diff collapsed)
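The brute_kNN.py diff is collapsed on this page, so its actual contents are not recoverable here. For context only, below is a minimal sketch of a PySpark brute-force kNN driver compatible with the flags the Slurm script passes (-p, -k, -n, -d, -f, -o, plus the -e event-log directory it appends). Every name, default, and the cartesian-product strategy are assumptions for illustration, not the deleted file's code; at 670 lines, the real implementation certainly did considerably more than this.

import argparse
import heapq

import numpy as np
from pyspark import SparkConf, SparkContext

def main():
    # Flag names mirror bknn-slurm-spark.sh; their meanings here are guesses,
    # since the original argument parser is not visible in this diff.
    parser = argparse.ArgumentParser(description="brute-force kNN (sketch)")
    parser.add_argument("-p", type=int, default=100, help="RDD partitions (assumed)")
    parser.add_argument("-k", type=int, default=10, help="neighbors per point")
    parser.add_argument("-n", type=int, help="number of points (assumed, unused here)")
    parser.add_argument("-d", type=int, help="dimensionality (assumed, unused here)")
    parser.add_argument("-f", required=True, help="input TSV, one point per line")
    parser.add_argument("-o", required=True, help="output directory")
    parser.add_argument("-e", help="Spark event-log directory (assumed)")
    args = parser.parse_args()

    conf = SparkConf().setAppName("brute_kNN")
    if args.e:
        conf.set("spark.eventLog.enabled", "true")
        conf.set("spark.eventLog.dir", args.e)
    sc = SparkContext(conf=conf)

    # (index, vector) pairs, one point per TSV row
    points = (sc.textFile(args.f, args.p)
                .map(lambda line: np.array([float(x) for x in line.split("\t")]))
                .zipWithIndex()
                .map(lambda v_i: (v_i[1], v_i[0])))

    # All pairs via cartesian, then keep the k nearest per point.
    # O(n^2) distance computations: a baseline, not a scalable method.
    knn = (points.cartesian(points)
                 .filter(lambda ab: ab[0][0] != ab[1][0])
                 .map(lambda ab: (ab[0][0],
                                  (float(np.linalg.norm(ab[0][1] - ab[1][1])),
                                   ab[1][0])))
                 .groupByKey()
                 .mapValues(lambda dists: heapq.nsmallest(args.k, dists)))

    knn.saveAsTextFile(args.o)
    sc.stop()

if __name__ == "__main__":
    main()

Under the Slurm script's settings this would be launched as spark-submit --master $MASTER $SPARK_ARGS brute_kNN.py -p 100 -k 10 -n 50000 -d 1000 -f $DATA_PATH -o testout -e $TEMP_OUT_DIR.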