Friday, February 25, 2011

Update of SUN cluster running scripts

I have updated/written a few scripts for running on the SUN cluster at the CHPC. They are available in my work directory under "runscripts".

run_roms_l.moab
    • The previous run script waited for the tiles to be joined before timestepping of the next month could start. Because joining is a serial (1 CPU) job, this left all the other processors idle.
    • Change: the tiles are now renamed with their dates so that they can be joined later by the next script (join_files.moab).
    • Restart and history files are still joined during the run: they are small and quick to join, and the joined files let you monitor your model run.
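
As a rough illustration of the renaming idea (not the actual contents of run_roms_l.moab), the sketch below stamps a year and month into each tile name and leaves the tile number untouched. The tile pattern `<prefix>.<tile>.nc`, the `_Y<year>M<month>` date format, and the function names are all assumptions made for the example; check the script header for the real convention.

```python
import re
from pathlib import Path

# Hypothetical tile naming: <prefix>.<tile>.nc, e.g. roms_avg.3.nc
TILE_PATTERN = re.compile(r"^(?P<prefix>.+)\.(?P<tile>\d+)\.nc$")


def dated_tile_name(name, year, month):
    """Return the tile name with an assumed _Y<year>M<month> date stamp."""
    match = TILE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a tile file: {name}")
    return f"{match['prefix']}_Y{year}M{month}.{match['tile']}.nc"


def rename_tiles(directory, prefix, year, month):
    """Rename every undated tile of one output type in a directory."""
    renamed = []
    for path in sorted(Path(directory).glob(f"{prefix}.*.nc")):
        if TILE_PATTERN.match(path.name) is None:
            continue  # skip anything that is not <prefix>.<tile>.nc
        target = path.with_name(dated_tile_name(path.name, year, month))
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Renaming only touches file names, so it costs almost nothing compared with joining, and timestepping of the next month can start straight away.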
join_files.moab
    • This takes the renamed tiles created above and joins them with ncjoin.
    • The joining is done in parallel, so the script can be submitted to the SUN cluster. Alternatively, it can run in serial on your desktop by changing the number of parallel processes.
    • Be careful with diagnostic files: I created a monster 28 GB average file!
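
The parallel join can be pictured roughly as follows. This is a sketch under assumptions, not the contents of join_files.moab: it assumes tiles are named `<stem>.<tile>.nc`, that `ncjoin` is on the PATH and accepts the tile files of one output as its arguments, and the function names and the `workers` parameter are made up for the example.

```python
import re
import subprocess
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tile naming: <stem>.<tile>.nc, e.g. roms_avg_Y2000M1.3.nc
TILE_PATTERN = re.compile(r"^(?P<stem>.+)\.(?P<tile>\d+)\.nc$")


def group_tiles(filenames):
    """Group tile files by stem, ordering each group by tile number."""
    groups = defaultdict(list)
    for name in filenames:
        match = TILE_PATTERN.match(name)
        if match:
            groups[match["stem"]].append((int(match["tile"]), name))
    return {stem: [n for _, n in sorted(tiles)] for stem, tiles in groups.items()}


def join_all(filenames, workers=4, join_cmd=("ncjoin",)):
    """Run one join command per group, at most `workers` at a time."""
    groups = group_tiles(filenames)

    def join_one(tiles):
        # Assumed invocation: ncjoin tile0 tile1 ...
        return subprocess.run([*join_cmd, *tiles], check=False).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(join_one, groups.values()))
    return dict(zip(groups, codes))  # stem -> exit code
```

Setting `workers=1` in this sketch gives the serial desktop case: the groups are then joined one after another.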
rename_files.sh or rename_files.moab
    • When the model crashes (due to an MPI error, for example), the timestepping and the file naming get out of sync. This script renames all the files with the correct dates after the run.
    • This is a serial procedure. The .moab version is there so that you can submit it to the cluster.
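
The file headers describe the actual renaming procedure. Purely as an illustration, and assuming one output file per model month, written in order, with a `_Y<year>M<month>` date stamp (all of these are assumptions, as are the function names), the correct names can be regenerated from the first month of the run:

```python
def month_labels(start_year, start_month, count):
    """Return `count` consecutive month labels, rolling over at December."""
    labels = []
    year, month = start_year, start_month
    for _ in range(count):
        labels.append(f"Y{year}M{month}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def plan_renames(files_in_order, prefix, start_year, start_month):
    """Pair each file (in the order written) with its corrected name."""
    labels = month_labels(start_year, start_month, len(files_in_order))
    return [(old, f"{prefix}_{label}.nc")
            for old, label in zip(files_in_order, labels)]
```

Building the list of (old, new) pairs before touching anything makes it easy to check the plan by eye before any file is renamed.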
I've tested writing, joining and renaming monthly average files. The scripts produce log and error files. I haven't tested them with nesting. Read the descriptions in the file headers.

Let me know if I missed something.
Nicolette
