sim_db for Python

Minimal Example using Python

A parameter file called params_minimal_python_example.txt is located in the sim_db/examples/ directory in the source code. The file contains the following:

name (string): minimal_python_example

run_command (string): python root/examples/minimal_example.py

param1 (string): "Minimal Python example is running."

param2 (int): 42

A Python script called minimal_example.py is found in the same directory:

import sim_db # 'sim_db/src/' has been included in the path.

# Open database and write some initial metadata to database.
sim_database = sim_db.SimDB()

# Read parameters from database.
param1 = sim_database.read("param1") # String
param2 = sim_database.read("param2") # Integer

# Print param1 just to show that the example is running.
print(param1)

# Write final metadata to database and close connection.
sim_database.close()

Add these simulation parameters to the sim_db database and run the simulation from the sim_db/examples/ directory with:

$ sim_db add_and_run -f params_minimal_python_example.txt

Extensive Example using Python

A parameter file called params_extensive_python_example.txt is found in the sim_db/examples/ directory in the source code. This parameter file contains all the possible types available in addition to some comments:

This is a comment, as any line without a colon is a comment.
# Adding a hashtag to the start of a comment line makes the comment easier to recognize.

# The name parameter is highly recommended to include.
name (string): extensive_python_example

# It is also recommended to include a description to further explain the intention of 
# the simulation.
description (string): Extensive Python example to demonstrate most features in sim_db.

run_command (string): python root/examples/extensive_example.py

# A parameter is added for each of the available types.
param1_extensive (int): 3
param2_extensive (float): -0.5e10
param3_extensive (string): "Extensive Python example is running."
param4_extensive (bool): True
param5_extensive (int array): [1, 2, 3]
param6_extensive (float array): [1.5, 2.5, 3.5]
param7_extensive (string array): ["a", "b", "c"]
param8_extensive (bool array): [True, False, True]

# Include parameters from another parameter file.
include_parameter_file: root/examples/extra_params_example.txt

# Change a parameter value from the included parameter file to demonstrate that
# it is the last parameter value that counts for a given parameter name.
extra_param1 (int): 9

The line in the parameter file starting with include_parameter_file: will be substituted with the contents of the specified extra_params_example.txt file, found in the same directory:

# Extra parameters included in the extensive examples.

extra_param1 (int): 7
extra_param2 (string): "Extra params added."
extra_param3 (bool): False
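
The include substitution and the "last value counts" rule can be illustrated with a small sketch. It is not sim_db's own parser: the file names params.txt and extra.txt, the in-memory FILES dictionary and the helper functions are all hypothetical, and values are kept as raw strings rather than converted to their declared types.

```python
import re

# Hypothetical in-memory stand-ins for a parameter file and the file it includes.
FILES = {
    "params.txt": """\
# A comment line.
name (string): extensive_python_example
param1_extensive (int): 3
include_parameter_file: extra.txt
extra_param1 (int): 9
""",
    "extra.txt": """\
extra_param1 (int): 7
extra_param3 (bool): False
""",
}

# Matches lines of the form "<name> (<type>): <value>".
PARAM_LINE = re.compile(r"^\s*(\w+)\s*\(([^)]+)\)\s*:\s*(.*)$")
INCLUDE_PREFIX = "include_parameter_file:"


def expand_includes(filename, files):
    """Return the lines of `filename` with include lines replaced by the included lines."""
    lines = []
    for line in files[filename].splitlines():
        if line.strip().startswith(INCLUDE_PREFIX):
            included = line.split(":", 1)[1].strip()
            lines.extend(expand_includes(included, files))
        else:
            lines.append(line)
    return lines


def parse_parameters(filename, files):
    """Map each parameter name to (type, raw value); later lines override earlier ones."""
    params = {}
    for line in expand_includes(filename, files):
        match = PARAM_LINE.match(line)
        if match is None:
            continue  # Comment or blank line.
        name, type_name, raw_value = match.groups()
        params[name] = (type_name.strip(), raw_value.strip())
    return params


params = parse_parameters("params.txt", FILES)
print(params["extra_param1"])  # The value from the including file replaces the included 7.
```

Because the included lines are spliced in at the position of the include line, a parameter set after that line overrides the included value, while one set before it would be overridden.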

extensive_example.py is also found in the same directory:

import numpy as np # Needed for np.array and np.savetxt below.
import sim_db # 'sim_db/' has been included in the path.

# Open database and write some initial metadata to database.
sim_database = sim_db.SimDB()

# Read parameters from database.
param1 = sim_database.read("param1_extensive") # Integer
param2 = sim_database.read("param2_extensive") # Float
param3 = sim_database.read("param3_extensive") # String 
param4 = sim_database.read("param4_extensive") # Bool 
param5 = sim_database.read("param5_extensive") # List of integers 
param6 = sim_database.read("param6_extensive") # List of floats 
param7 = sim_database.read("param7_extensive") # List of strings 
param8 = sim_database.read("param8_extensive") # List of bools 

# Demonstrate that the simulation is running.
print(param3)

# Write to database.
sim_database.write("example_result_1", param1, type_of_value="int")
sim_database.write("example_result_2", param2, type_of_value="float")
sim_database.write("example_result_3", param3, type_of_value="string")
sim_database.write("example_result_4", param4, type_of_value="bool")
sim_database.write("example_result_5", param5, type_of_value="int array")
sim_database.write("example_result_6", param6, type_of_value="float array")
sim_database.write("example_result_7", param7, type_of_value="string array")
sim_database.write("example_result_8", param8, type_of_value="bool array")

# Make unique subdirectory for storing results and write its name to database.
results = np.array(param6)
name_results_dir = sim_database.unique_results_dir("root/examples/results")
np.savetxt(name_results_dir + "/results.txt", results)

# Check if column exists in database.
is_column_in_database = sim_database.column_exists("column_not_in_database")

# Check if column is empty and then set it to empty.
sim_database.is_empty("example_result_1")
sim_database.set_empty("example_result_1")

# Get the 'ID' of the connected simulation and the path to the root directory.
db_id = sim_database.get_id()
path_proj_root = sim_database.get_path_proj_root()

# Write final metadata to the database and close the connection.
sim_database.close()

# Add an empty simulation to database, open connection and write to it.
sim_database_2 = sim_db.add_empty_sim(False)
sim_database_2.write("param1_extensive", 7, type_of_value="int")

# Delete simulation from the database.
sim_database_2.delete_from_database()

# Close connection to the database.
sim_database_2.close()

Add these simulation parameters to the sim_db database and run the simulation from the sim_db/examples/ directory with:

$ sdb add_and_run -f params_extensive_python_example.txt

Python API Reference

Read and write parameters, results and metadata to the ‘sim_db’ database.

class SimDB(store_metadata=True, db_id=None, rank=None, only_write_on_rank=0)

To interact with the sim_db database.

For an actual simulation it should be initialised at the very start of the simulation (with ‘store_metadata’ set to True) and closed with close() at the very end of the simulation. This must be done to add the correct metadata.

For multithreading/multiprocessing each thread/process MUST have its own connection (instance of this class) and MUST provide it with its rank.

__init__(store_metadata=True, db_id=None, rank=None, only_write_on_rank=0)

Connect to the sim_db database.

Parameters
  • store_metadata (bool) – If False, no metadata is added to the database. Typically used when postprocessing (visualizing) data from a simulation.

  • db_id (int) – ID number of the simulation parameters in the sim_db database. If it is ‘None’, then it is read from the argument passed to the program after the option ‘--id’.

  • rank (int) – Number identifying the calling process and/or thread. (Typically the MPI or OpenMP rank.) If provided, only the ‘rank’ matching ‘only_write_on_rank’ will write to the database to avoid too much concurrent writing to the database. Single-process, single-threaded programs may ignore this, while multithreading/multiprocessing programs need to provide it.

  • only_write_on_rank (int) – Number identifying the only process/thread that will write to the database. Only used if ‘rank’ is provided.
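
The effect of ‘rank’ and ‘only_write_on_rank’ can be sketched with a stand-in class. RankGatedWriter and its dictionary storage are hypothetical and only mimic the rank check described above; they are not the SimDB implementation.

```python
class RankGatedWriter:
    """Hypothetical stand-in that mimics only the rank check, not the real SimDB class."""

    def __init__(self, storage, rank=None, only_write_on_rank=0):
        self.storage = storage  # Shared dict standing in for the database.
        self.rank = rank
        self.only_write_on_rank = only_write_on_rank

    def write(self, column, value):
        # Without a rank, every caller writes; with a rank, only the chosen one does.
        if self.rank is not None and self.rank != self.only_write_on_rank:
            return False
        self.storage[column] = value
        return True


database = {}
# One writer per simulated process, each given its own rank.
writers = [RankGatedWriter(database, rank=r) for r in range(4)]
wrote = [w.write("result", f"from rank {w.rank}") for w in writers]
print(wrote, database)
```

Every simulated process calls write() in the same way, but only the one whose rank equals ‘only_write_on_rank’ touches the shared storage, which is the behaviour the parameter descriptions imply.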

read(column, check_type_is='')

Read parameter in ‘column’ from the database.

Return None if parameter is empty.

Parameters
  • column (str) – Name of the column the parameter is read from.

  • check_type_is – Raises ValueError if the type does not match ‘check_type_is’. The valid types are the strings ‘int’, ‘float’, ‘bool’, ‘string’ and ‘int/float/bool/string array’, or the types int, float, bool, str and list.

Raises
  • ColumnError – If the column does not exist.

  • ValueError – If return type does not match ‘check_type_is’.

  • sqlite3.OperationalError – Waited more than 5 seconds to read from the database, because other threads/processes are busy writing to it. Way too much concurrent writing is done and it indicates a design error in the user program.
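
A check in the spirit of ‘check_type_is’ might look like the sketch below. The helpers matches_type and checked are hypothetical and written only from the description above. Treating a bool as not matching ‘int’ is a choice made in this sketch, not confirmed behaviour of sim_db.

```python
# Element type expected for each scalar type name listed in the documentation.
SCALAR_TYPES = {"int": int, "float": float, "bool": bool, "string": str}


def matches_type(value, check_type_is):
    """Return True if `value` fits `check_type_is`, given as a string or a Python type."""
    if isinstance(check_type_is, type):
        if check_type_is is int:
            # bool is a subclass of int in Python, so exclude it explicitly.
            return isinstance(value, int) and not isinstance(value, bool)
        return isinstance(value, check_type_is)
    if check_type_is.endswith(" array"):
        element_type = SCALAR_TYPES[check_type_is[: -len(" array")]]
        return isinstance(value, list) and all(
            matches_type(element, element_type) for element in value
        )
    return matches_type(value, SCALAR_TYPES[check_type_is])


def checked(value, check_type_is=""):
    """Return `value`, raising ValueError on a type mismatch; '' disables the check."""
    if check_type_is != "" and not matches_type(value, check_type_is):
        raise ValueError(f"{value!r} does not match type {check_type_is!r}")
    return value


print(checked(42, "int"), checked([1.5, 2.5], "float array"), checked("a", str))
```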

write(column, value, type_of_value='', only_if_empty=False)

Write value to ‘column’ in the database.

If ‘column’ does not exist, a new column is added.

If value is None and type_of_value is not set, the entry under ‘column’ is set to empty.

For multithreaded and multiprocess programs only a single process/thread will write to the database to avoid too much concurrent writing to the database. This holds as long as the ‘rank’ was passed to SimDB during initialisation.

Parameters
  • column (str) – Name of the column the value is written to.

  • value – New value of the specified entry in the database.

  • type_of_value (str or type) – Needed if the column does not exist or if value is an empty list. The valid types are the strings ‘int’, ‘float’, ‘bool’, ‘string’ and ‘int/float/bool/string array’, or the types int, float, bool and str.

  • only_if_empty (bool) – If True, it will only write to the database if the simulation’s entry under ‘column’ is empty.

Raises

ValueError – If the column exists but the type does not match, or if an empty list is passed without type_of_value given.
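
The interplay between writing None and ‘only_if_empty’ can be sketched with an in-memory stand-in. FakeEntry is hypothetical: it represents empty entries as None and ignores types, rank handling and the database itself.

```python
class FakeEntry:
    """Hypothetical in-memory row; empty columns are stored as None."""

    def __init__(self):
        self.columns = {}

    def write(self, column, value, only_if_empty=False):
        # A missing column is created on first write.
        current = self.columns.get(column)
        if only_if_empty and current is not None:
            return  # Keep the existing value.
        self.columns[column] = value  # Writing None marks the entry as empty.

    def is_empty(self, column):
        return self.columns.get(column) is None


row = FakeEntry()
row.write("result", 1)
row.write("result", 2, only_if_empty=True)  # Ignored, the entry already holds 1.
kept = row.columns["result"]
row.write("result", None)                   # Entry is now empty again.
row.write("result", 3, only_if_empty=True)  # Accepted, the entry was empty.
print(kept, row.columns["result"])
```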

unique_results_dir(path_directory)

Get path to subdirectory in ‘path_directory’ unique to simulation.

The subdirectory will be named ‘date_time_name_id’ and is intended to store results in. If ‘results_dir’ in the database is empty, a new and unique directory is created and the path stored in ‘results_dir’. Otherwise the path in ‘results_dir’ is just returned.

Parameters

path_directory (str) – Path to directory of which to make a subdirectory. If ‘path_directory’ starts with ‘root/’, that part will be replaced by the full path of the root directory of the project.

Returns

Full path to new subdirectory.

Return type

str
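
The ‘root/’ replacement and the ‘date_time_name_id’ naming can be sketched as follows. Both helpers are hypothetical, the project path /home/user/project is made up, and the exact timestamp format used by sim_db is not stated above, so the one below is an assumption.

```python
import os
from datetime import datetime


def expand_root(path_directory, path_proj_root):
    """Replace a leading 'root/' with the project's root directory."""
    if path_directory.startswith("root/"):
        return os.path.join(path_proj_root, path_directory[len("root/"):])
    return path_directory


def results_dir_name(now, name, db_id):
    """Build a 'date_time_name_id' name; the timestamp format is an assumption."""
    return f"{now:%Y-%m-%d}_{now:%H-%M-%S}_{name}_{db_id}"


parent = expand_root("root/examples/results", "/home/user/project")
subdir = results_dir_name(datetime(2024, 1, 31, 12, 0, 5), "extensive_python_example", 7)
print(os.path.join(parent, subdir))
```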

column_exists(column)

Return True if column is a column in the database.

Raises

sqlite3.OperationalError – Waited more than 5 seconds to read from the database, because other threads/processes are busy writing to it. Way too much concurrent writing is done and it indicates a design error in the user program.
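
The kind of failure described here can be reproduced with Python's standard sqlite3 module alone. The sketch below does not use sim_db: the table name runs and the temporary database file are made up, and the timeout is shortened to 0.2 seconds so the demonstration finishes quickly.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.db")

# isolation_level=None gives autocommit mode, so the explicit BEGIN below
# is the only open transaction on this connection.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("CREATE TABLE runs (id INTEGER PRIMARY KEY, name TEXT)")
writer.execute("BEGIN EXCLUSIVE")  # Hold the lock, like a long-running write.
writer.execute("INSERT INTO runs (name) VALUES ('busy')")

# A second connection gives up after its timeout while the lock is held.
reader = sqlite3.connect(path, timeout=0.2)
try:
    reader.execute("SELECT name FROM runs").fetchall()
    error_message = None
except sqlite3.OperationalError as error:
    error_message = str(error)

writer.execute("COMMIT")  # Releasing the lock lets the reader through.
rows_after_commit = reader.execute("SELECT name FROM runs").fetchall()
print(error_message, rows_after_commit)
writer.close()
reader.close()
```

If this error shows up in a real simulation, the usual remedy is to reduce how often and from how many processes the database is written, for example by passing ‘rank’ when creating the connection.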

is_empty(column)

Return True if entry in the database under ‘column’ is empty.

Raises

sqlite3.OperationalError – Waited more than 5 seconds to read from the database, because other threads/processes are busy writing to it. Way too much concurrent writing is done and it indicates a design error in the user program.

set_empty(column)

Set entry under ‘column’ in the database to empty.

get_id()

Return ‘ID’ of the connected simulation.

get_path_proj_root()

Return the path to the root directory of the project.

The project’s root directory is assumed to be where the ‘.sim_db/’ directory is located.
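
One plausible way to locate such a root is to walk upwards from a starting directory until ‘.sim_db/’ is found. The function find_proj_root is hypothetical, and the upward search is an assumption about how a lookup like this could work, not a description of sim_db's code.

```python
import tempfile
from pathlib import Path


def find_proj_root(start):
    """Walk upwards from `start` until a directory containing '.sim_db/' is found."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / ".sim_db").is_dir():
            return candidate
    return None


# Build a throwaway project layout to exercise the search.
project = Path(tempfile.mkdtemp()).resolve()
(project / ".sim_db").mkdir()
nested = project / "examples" / "results"
nested.mkdir(parents=True)
print(find_proj_root(nested) == project)
```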

update_sha1_executables(paths_executables)

Update the ‘sha1_executable’ column in the database.

Sets the entry to the sha1 of all the executables. The order will affect the value.

Parameters

paths_executables ([str]) – List of full paths to executables.

Raises

sqlite3.OperationalError – Waited more than 5 seconds to write to the database, because other threads/processes are busy writing to it. Way too much concurrent writing is done and it indicates a design error in the user program.
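
Why the order of the executables matters can be seen with a small hashlib sketch. How sim_db actually combines the files is not specified above. Feeding all files into a single SHA-1 digest, as the hypothetical combined_sha1 does, is one assumption that gives an order-dependent result.

```python
import hashlib
import os
import tempfile


def combined_sha1(paths_executables):
    """Feed every file, in the given order, into a single SHA-1 digest."""
    sha1 = hashlib.sha1()
    for path in paths_executables:
        with open(path, "rb") as executable:
            # Read in chunks so large binaries need not fit in memory.
            for chunk in iter(lambda: executable.read(65536), b""):
                sha1.update(chunk)
    return sha1.hexdigest()


# Two small made-up files stand in for real executables.
directory = tempfile.mkdtemp()
paths = []
for name, content in [("solver", b"first binary"), ("postprocess", b"second binary")]:
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(content)
    paths.append(path)

forward = combined_sha1(paths)
backward = combined_sha1(list(reversed(paths)))
print(forward != backward)
```

Passing the paths in a fixed order, for example sorted, keeps the stored value comparable between runs.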

delete_from_database()

Delete simulation from database.

Raises

sqlite3.OperationalError – Waited more than 5 seconds to write to the database, because other threads/processes are busy writing to it. Way too much concurrent writing is done and it indicates a design error in the user program.

close()

Close the connection to the sim_db database and add metadata.

add_empty_sim(store_metadata=False)

Add an empty entry to the database and return a SimDB connected to it.

Parameters

store_metadata (bool) – If False, no metadata is added to the database. Typically used when postprocessing (visualizing) data from a simulation.