- Example workflow description
- Job Arrays (low overhead)
- SnakeMake (readability prioritized)
- Pegasus (portability prioritized)
- Makeflow (simplicity prioritized)
- Questions
- Lab/Project time
2022-04-15
git clone https://github.com/uschpc/workshop-workflows.gitindex.html in browserexamples-wget --no-check-certificate https://g-3d96ec.a78b8.36fe.data.globus.org/symphonie_fantastique.tar (download tar file)symphonie_fantastique.tar, you can extract with tar xfv symphonie_fantastique.tar
data directoryinput.flac song file as inputffmpeg
input.flacoutput.mp4output output/<song_name>/<song_name>.mp4output/<song_name>/images/*.pngrosa_fft.pyrosa_fft.py
python3 rosa_fft.py -f 09Teachers.flac -i 172 -d 4 -o 09Teachers| option | meaning |
|---|---|
| -f | which song file to read from (doesn’t have to be .flac format) |
| -i | initial time to start reading data |
| -d | how many seconds of song to read (default is 1) |
| -o | which directory to save images |
ffmpegffmpeg -threads 8 -framerate 60 -i 16Funk_Ad/images/frame%d.png -i 16Funk_Ad.flac -pix_fmt yuv420p 16Funk_Ad.mp4| option | meaning |
|---|---|
| -threads | How many cpus to use when converting fiels |
| -framerate | How many pictures per second |
| -i | name or name pattern for input file(s) |
| -pix_fmt yuv420p | video encoding option? |
| output_file.mp4 | output file name |
rosa_fft_array.slurm:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu-1GB
#SBATCH --time=00:10:00
#SBATCH --array=0-9
module load python ffmpeg
# How many seconds to process per job
duration=4
time_index=$((SLURM_ARRAY_TASK_ID*duration))
song_file="data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3"
out_dir="output/berlioz_symphonie_2_un_bal_vals"
echo "Processing time_index:${time_index}"
python3 ./scripts/rosa_fft.py -f ${song_file} \
-i ${time_index} -t ${duration} -o ${out_dir}
$SLURM_ARRAY_TASK_ID will be any number from 0-9examples/job_arraypython3 ./scripts/rosa_fft.py -f ${song_file} \
-i ${time_index} -t ${duration} -o ${out_dir}
| Variable | meaning | value |
|---|---|---|
${song_file} |
Song file to process | data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3 |
$SLURM_ARRAY_TASK_ID |
Which array element are we? | 0-9 |
${duration} |
How many seconds to process? | 4 |
${time_index} |
Time to start processing | $((SLURM_ARRAY_TASK_ID*duration)) |
${out_dir} |
Where to save output data | output/berlioz_symphonie_2_un_bal_vals |
examples/job_array/ffmpeg.slurmffmpeg.slurm:
#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=1GB
#SBATCH --time=00:10:00
module load ffmpeg
song_file="data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3"
frame_dir="output/berlioz_symphonie_2_un_bal_vals/images"
out_file="output/berlioz_symphonie_2_un_bal_vals.mp4"
ffmpeg \
-framerate 60 -i ${frame_dir}/frame%d.png \
-i ${song_file} \
-pix_fmt yuv420p \
-threads ${SLURM_CPUS_PER_TASK} \
${out_file}
rosa_fft_array.slurm completes.bash we can capture the job id
Submitted batch job XXXXXcut to parse textexamples/job_array/manager.sh:
#!/bin/bash
# save 4th word using ' ' as delmiter
# AKA the job id
jid=$(sbatch examples/job_array/job_array.slurm | cut -d ' ' -f 4)
sbatch --dependency=afterok:${jid} examples/job_array/ffmpeg.slurm
rule bwa:
input:
"data/genome.fa",
"data/samples/{sample}.fastq"
output:
temp("mapped/{sample}.bam")
conda:
"envs/mapping.yaml"
threads: 8
shell:
"bwa mem -t {threads} {input} | samtools view -Sb - > {output}"
MakeFile used in building softwareoutputfile(s): inputfile(s)
# Leading whitespace below must be tab character, not spaces!
command to generate outputfile(s)
CURL=/usr/bin/curl
CONVERT=/usr/bin/convert
URL="http://ccl.cse.nd.edu/images/capitol.jpg"
capitol.anim.gif: capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg
LOCAL $(CONVERT) -delay 10 -loop 0 capitol.jpg capitol.90.jpg capitol.180.jpg capitol.270.jpg capitol.360.jpg capitol.270.jpg capitol.180.jpg capitol.90.jpg capitol.anim.gif
capitol.90.jpg: capitol.jpg
$(CONVERT) -swirl 90 capitol.jpg capitol.90.jpg
capitol.180.jpg: capitol.jpg
$(CONVERT) -swirl 180 capitol.jpg capitol.180.jpg
capitol.270.jpg: capitol.jpg
$(CONVERT) -swirl 270 capitol.jpg capitol.270.jpg
capitol.360.jpg: capitol.jpg
$(CONVERT) -swirl 360 capitol.jpg capitol.360.jpg
capitol.jpg:
LOCAL $(CURL) -o capitol.jpg $(URL)
./berlioz_symphonie_1_reveries_pa/berlioz_symphonie_1_reveries_pa.mp4: ./berlioz_symphonie_1_reveries_pa/images/frame000000000.png ./berlioz_symphonie_1_reveries_pa/images/frame000000001.png ./berlioz_symphonie_1_reveries_pa/images/frame000000002.png ./berlioz_symphonie_1_reveries_pa/images/frame000000003.png ./berlioz_symphonie_1_reveries_pa/images/frame000000004.png ./berlioz_symphonie_1_reveries_pa/images/frame000000005.png ./berlioz_symphonie_1_reveries_pa/images/frame000000006.png ./berlioz_symphonie_1_reveries_pa/images/frame000000007.png ./berlioz_symphonie_1_reveries_pa/images/frame000000008.png ./berlioz_symphonie_1_reveries_pa/images/frame000000009.png ./berlioz_symphonie_1_reveries_pa/images/frame000000010.png ./berlioz_symphonie_1_reveries_pa/images/frame000000011.png ./berlioz_symphonie_1_reveries_pa/images/frame000000012.png ./berlioz_symphonie_1_reveries_pa/images/frame000000013.png ./berlioz_symphonie_1_reveries_pa/images/frame000000014.png ./berlioz_symphonie_1_reveries_pa/images/frame000000015.png
fmpeg -threads 4 -framerate 60 -i Homework/04Da_Funk/images/frame%09d.png -i data/Homework/04Da_Funk.flac -pix_fmt yuv420p Homework/04Da_Funk/04Da_Funk.mp4
./berlioz_symphonie_1_reveries_pa/images/frame000000000.png ./berlioz_symphonie_1_reveries_pa/images/frame000000001.png ./berlioz_symphonie_1_reveries_pa/images/frame000000002.png ./berlioz_symphonie_1_reveries_pa/images/frame000000003.png ./berlioz_symphonie_1_reveries_pa/images/frame000000004.png ./berlioz_symphonie_1_reveries_pa/images/frame000000005.png ./berlioz_symphonie_1_reveries_pa/images/frame000000006.png ./berlioz_symphonie_1_reveries_pa/images/frame000000007.png ./berlioz_symphonie_1_reveries_pa/images/frame000000008.png ./berlioz_symphonie_1_reveries_pa/images/frame000000009.png ./berlioz_symphonie_1_reveries_pa/images/frame000000010.png ./berlioz_symphonie_1_reveries_pa/images/frame000000011.png ./berlioz_symphonie_1_reveries_pa/images/frame000000012.png ./berlioz_symphonie_1_reveries_pa/images/frame000000013.png
python3 /scripts/rosa_fft.py -f /project/hpcroot/csul/workshop-workflows/data/symphonie_fantastique/berlioz_symphonie_1_reveries_pa.mp3 -i 1 -o ./berlioz_symphonie_1_reveries_pa
"rules":[
{
"outputs":["input.txt"],
"command":"echo \"Hello Makeflow!\" > input.txt",
"local_job":true,
},
{
"outputs":[format("output.%d",i)],
"inputs":["simulation.py","input.txt"],
"command":format("./simulation.py %d < input.txt > output.%d", i, i),
} for i in range(1,5)
],
}
"define":
{
"OUTDIR":"output/berlioz_symphonie_2_un_bal_vals/images",
"SONGNAME" : "berlioz_symphonie_2_un_bal_vals",
"FFMPEG_THREADS" : 4,
"FRAME_RATE" : 60,
"SONG_DURATION": 370
},
"rules":[
# Merge all frames into 1 video
{
"command" : format("ffmpeg -threads %d -framerate %d -i output/%s/images/frame\%%d.png -i ../../data/symphonie_fantastique/%s.mp3 output/%s/%s.mp4",FFMPEG_THREADS,FRAME_RATE,SONGNAME,SONGNAME,SONGNAME,SONGNAME),
"inputs" : [ format("output/%s/images/frame%d.png",SONGNAME, x) for x in range(0,SONG_DURATION*FRAME_RATE)],
"output" : [format("output/%s/%s.mp4",SONGNAME,SONGNAME)],
},{
"command" : format("python3 ../../scripts/rosa_fft.py -f ../../data/symphonie_fantastique/%s.mp3 -i "+N+" -o output/%s",SONGNAME,SONGNAME),
"inputs" : ["../../scripts/rosa_fft.py" , format("../../data/symphonie_fantastique/%s.mp3",SONGNAME)],
"outputs":
[
format("%s/frame%d.png",OUTDIR,x) for x in range(N*FRAME_RATE,(N+1)*FRAME_RATE)
]
} for N in range(0,370)
]
}
"define":
{
"OUTDIR":"output/berlioz_symphonie_2_un_bal_vals/images",
"SONGNAME":"berlioz_symphonie_2_un_bal_vals",
"FFMPEG_THREADS":4,
"FRAME_RATE":60,
"SONG_DURATION":370
},
"rules":
[
{
"command":"ffmpeg -threads 4 -framerate 60 -i output/berlioz_symphonie_2_un_bal_vals/images/frame%d.png -i ../../data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3 output/berlioz_symphonie_2_un_bal_vals/berlioz_symphonie_2_un_bal_vals.mp4",
"inputs":
[
"output/berlioz_symphonie_2_un_bal_vals/images/frame0.png",
"output/berlioz_symphonie_2_un_bal_vals/images/frame1.png",
.
.
.
"output/berlioz_symphonie_2_un_bal_vals/images/frame22199.png"
],
"output":
"output/berlioz_symphonie_2_un_bal_vals/berlioz_symphonie_2_un_bal_vals.mp4"
}
salloc --ntasks=1 --cpus-per-task=8 --time=1:00:00 --mem-per-cpu=2GBexamples/clip/berlioz_symphonie_2_un_bal_vals.jx$ makeflow --jx clip_berlioz_symphonie_2_un_bal_vals.jx parsing clip_berlioz_symphonie_2_un_bal_vals.jx... local resources: 32 cores, 191863 MB memory, 1438549300 MB disk max running local jobs: 32 checking clip_berlioz_symphonie_2_un_bal_vals.jx for consistency... clip_berlioz_symphonie_2_un_bal_vals.jx has 4 rules. creating new log file clip_berlioz_symphonie_2_un_bal_vals.jx.makeflowlog... checking files for unexpected changes... (use --skip-file-check to skip this step) starting workflow.... submitting job: python3 ../../scripts/rosa_fft.py -f ../../data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3 -i 111 -o output/berlioz_symphonie_2_un_bal_vals submitted job 211569 submitting job: python3 ../../scripts/rosa_fft.py -f ../../data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3 -i 110 -o output/berlioz_symphonie_2_un_bal_vals submitted job 211570 submitting job: python3 ../../scripts/rosa_fft.py -f ../../data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3 -i 109 -o output/berlioz_symphonie_2_un_bal_vals submitted job 211571 job 211569 completed job 211570 completed job 211571 completed submitting job: ffmpeg -threads 4 -framerate 60 -start_number 6540 -i output/berlioz_symphonie_2_un_bal_vals/images/frame%d.png -ss 00:01:49 -to 00:01:52 -i ../../data/symphonie_fantastique/berlioz_symphonie_2_un_bal_vals.mp3 output/berlioz_symphonie_2_un_bal_vals/berlioz_symphonie_2_un_bal_vals.mp4 submitted job 216156 job 216156 completed deleted makeflow.failed.0 nothing left to do.
examples/clips/output /<song_name>/<song_name>.mp4/<song_name>/images/*.pngimages directory contains intermediate filesmakeflow --clean=intermediates examples/clip/01Daftendirekt.jx$ makeflow --clean=intermediates --jx clip_berlioz_symphonie_2_un_bal_vals.jx parsing clip_berlioz_symphonie_2_un_bal_vals.jx... local resources: 32 cores, 191863 MB memory, 1438503624 MB disk max running local jobs: 32 checking clip_berlioz_symphonie_2_un_bal_vals.jx for consistency... clip_berlioz_symphonie_2_un_bal_vals.jx has 4 rules. recovering from log file clip_berlioz_symphonie_2_un_bal_vals.jx.makeflowlog... checking for old running or failed jobs... checking files for unexpected changes... (use --skip-file-check to skip this step) cleaning filesystem... deleted output/berlioz_symphonie_2_un_bal_vals/images/frame6616.png deleted output/berlioz_symphonie_2_un_bal_vals/images/frame6578.png deleted output/berlioz_symphonie_2_un_bal_vals/images/frame6646.png deleted output/berlioz_symphonie_2_un_bal_vals/images/frame6559.png
projects/project1/job_array.slurm_template#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu-1GB
#SBATCH --time=00:10:00
#SBATCH --array=0-9
module load python ffmpeg libsndfile
# How many seconds to process per job
duration=4
time_index=$((SLURM_ARRAY_TASK_ID*duration))
song_file="XXX.mp3"
out_dir="output/XXX"
echo "Processing time_index:${time_index}"
python3 ./scripts/rosa_fft.py -f ${song_file} -i ${time_index} -t ${duration} -o ${out_dir}
sed 's/XXX/cool_song/g' job_array.slurm_template will find every XXX and replace with cool_songexamples/job_array as neededProject was based on an earlier version of this presentation, start point not working right now :(
projects/project2/generate_makefile.pymakeflow -T wq projects/project3/berlioz_symphonie_1_reveries_pa.makeflow -p $PORTexport PORT=8888$PORTparsing makeflows/Blitz_it.makeflow... local resources: 20 cores, 64334 MB memory, 5566556492 MB disk max running remote jobs: 1000 max running local jobs: 20 checking makeflows/Blitz_it.makeflow for consistency... makeflows/Blitz_it.makeflow has 72 rules. submitted job 43 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 27 -o ./Blitz_it submitted job 44 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 26 -o ./Blitz_it submitted job 45 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 25 -o ./Blitz_it submitted job 46 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 24 -o ./Blitz_it submitted job 47 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 23 -o ./Blitz_it submitted job 48 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 22 -o ./Blitz_it submitted job 49 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 21 -o ./Blitz_it submitted job 50 submitting job: python3 ./scripts/rosa_fft.py -f data/chirpy/Blitz_it.wav -i 20 -o ./Blitz_it
srun work_queue_worker $HOSTNAME:$PORT projects/project3/berlioz_symphonie_1_reveries_pa.makeflow$HOSTNAME must be the hostname where the job manager process is running$PORT must be same port that manager is listening forwork_queue_worker: creating workspace /scratch/csul/makeflow/example/worker-268648-412 work_queue_worker: creating workspace /scratch/csul/makeflow/example/worker-268648-416 work_queue_worker: creating workspace /scratch/csul/makeflow/example/worker-268648-406 work_queue_worker: creating workspace /scratch/csul/makeflow/example/worker-268648-404 work_queue_worker: using 24 cores, 193123 MB memory, 185757288 MB disk, 0 gpus connected to manager d06-15.hpc.usc.edu:8080 via local address 10.125.19.236:57664 work_queue_worker: using 24 cores, 193123 MB memory, 185757288 MB disk, 0 gpus connected to manager d06-15.hpc.usc.edu:8080 via local address 10.125.19.236:57668 work_queue_worker: using 24 cores, 193123 MB memory, 185757288 MB disk, 0 gpus connected to manager d06-15.hpc.usc.edu:8080 via local address 10.125.19.236:57672 work_queue_worker: using 24 cores, 193123 MB memory, 185757284 MB disk, 0 gpus connected to manager d06-15.hpc.usc.edu:8080 via local address 10.125.19.234:59316