Components
Component basics
There are a few basic components that underpin rail_pz_service functionality.
Generating a Request requires a Dataset and an Estimator. The Dataset is associated with a CatalogTag that specifies which columns to expect in it. The Estimator runs the analysis using a Model trained for the Algorithm that underlies that Estimator, and it must be compatible with the CatalogTag of the Dataset.
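The compatibility rule described above can be sketched with plain dataclasses. This is an illustration only: the stand-in classes and the helper `can_run` are hypothetical and are not part of rail_pz_service. Only the field names (`algo_id`, `catalog_tag_id`, `model_id`) mirror the columns documented below.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    """Hypothetical stand-in for a row of the model table."""

    id: int
    algo_id: int
    catalog_tag_id: int


@dataclass
class EstimatorRow:
    """Hypothetical stand-in for a row of the estimator table."""

    id: int
    algo_id: int
    catalog_tag_id: int
    model_id: int


@dataclass
class DatasetRow:
    """Hypothetical stand-in for a row of the dataset table."""

    id: int
    catalog_tag_id: int


def can_run(estimator: EstimatorRow, model: ModelRow, dataset: DatasetRow) -> bool:
    """Return True if the estimator, its model, and the dataset are mutually compatible."""
    return (
        estimator.model_id == model.id
        and estimator.algo_id == model.algo_id
        and estimator.catalog_tag_id == model.catalog_tag_id
        and estimator.catalog_tag_id == dataset.catalog_tag_id
    )


model = ModelRow(id=1, algo_id=10, catalog_tag_id=100)
estimator = EstimatorRow(id=2, algo_id=10, catalog_tag_id=100, model_id=1)
good_dataset = DatasetRow(id=3, catalog_tag_id=100)
bad_dataset = DatasetRow(id=4, catalog_tag_id=200)

print(can_run(estimator, model, good_dataset))  # True
print(can_run(estimator, model, bad_dataset))   # False
```

The second call fails because the dataset carries a different catalog tag than the estimator, which is the mismatch the text above warns about.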
Request
- class rail_pz_service.db.Request(**kwargs)
Basic processing unit in rail_pz_service. A Request generates per-galaxy p(z) for all of the objects in a particular Dataset using a specific Estimator.
The output p(z) distribution will be stored in a qp file.
This also stores some metadata, including timestamps and the user who initiated the Request.
- pydantic_mode_class
alias of
Request
- id: Mapped[int]
primary key
- user: Mapped[str]
User who originated this Request
- estimator_id: Mapped[int]
foreign key into estimator table
- dataset_id: Mapped[int]
foreign key into dataset table
- qp_file_path: Mapped[str | None]
path to the output file
- time_created: Mapped[datetime]
timestamp of when the request was created in the DB
- time_started: Mapped[datetime | None]
timestamp of when the request processing was started by an Estimator
- time_finished: Mapped[datetime | None]
timestamp of when the request processing was finished
- estimator_: Mapped[Estimator]
Access to associated Estimator
- dataset_: Mapped[Dataset]
Access to associated Dataset
- col_names_for_table = ['id', 'user', 'estimator_id', 'dataset_id', 'qp_file_path']
column names to use when printing the table
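The three timestamp columns above imply a simple lifecycle: created, then started, then finished. A minimal sketch of reading that lifecycle follows. `RequestRow`, `request_status`, and `processing_time` are hypothetical names used only for illustration. They are not part of rail_pz_service, and the status labels are an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RequestRow:
    """Hypothetical stand-in that carries only the timestamp columns."""

    time_created: datetime
    time_started: Optional[datetime] = None
    time_finished: Optional[datetime] = None


def request_status(req: RequestRow) -> str:
    """Infer a coarse status from which timestamps have been filled in."""
    if req.time_finished is not None:
        return "finished"
    if req.time_started is not None:
        return "running"
    return "pending"


def processing_time(req: RequestRow) -> Optional[timedelta]:
    """Return the processing duration, or None if the request has not finished."""
    if req.time_started is None or req.time_finished is None:
        return None
    return req.time_finished - req.time_started


created = datetime(2025, 1, 1, 12, 0, 0)
req = RequestRow(time_created=created)
status_before = request_status(req)  # no start or finish time yet

req.time_started = created + timedelta(seconds=5)
req.time_finished = created + timedelta(seconds=65)
status_after = request_status(req)
elapsed = processing_time(req)

print(status_before, status_after, elapsed.total_seconds())  # pending finished 60.0
```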
Dataset
- class rail_pz_service.db.Dataset(**kwargs)
Color data about a set of objects that can be used to obtain p(z) estimates.
It is associated with a CatalogTag that defines which column names to expect.
It can be stored either in a file (for larger datasets) or as a python dict (for small datasets of a few objects, useful when uploading data on the fly).
- pydantic_mode_class
alias of
Dataset
- id: Mapped[int]
primary key
- name: Mapped[str]
Name for this Dataset, unique
- n_objects: Mapped[int]
Number of objects in the dataset
- path: Mapped[str | None]
Path to the relevant file (could be None)
- data: Mapped[dict | None]
Data for the dataset (could be None)
- catalog_tag_id: Mapped[int]
foreign key into catalog_tag table
- catalog_tag_: Mapped[CatalogTag]
Access to associated CatalogTag
- requests_: Mapped[list[Request]]
Access to list of associated Request
- col_names_for_table = ['id', 'name', 'n_objects', 'catalog_tag_id', 'path']
column names to use when printing the table
- classmethod validate_data_for_path(path, catalog_tag)
Validate that these data are appropriate for the CatalogTag
- Parameters:
path (Path) – File with the data
catalog_tag (CatalogTag) – CatalogTag in question
- Returns:
Size of the dataset
- Return type:
int
- classmethod validate_data(data, catalog_tag)
Validate that these data are appropriate for the CatalogTag
- Parameters:
data (dict) – Data in question
catalog_tag (CatalogTag) – CatalogTag in question
- Returns:
Size of the dataset, data formatted as strings
- Return type:
tuple[int, dict[str, list[float]]]
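To make the validation step concrete, here is a rough sketch of the kind of check `validate_data` might perform for an in-memory dict. It is not the actual implementation. The function `validate_data_sketch`, its `expected_columns` argument, and the column names `mag_g` and `mag_r` are assumptions made for illustration, and the real method takes a CatalogTag rather than a list of names.

```python
from __future__ import annotations


def validate_data_sketch(
    data: dict, expected_columns: list[str]
) -> tuple[int, dict[str, list[float]]]:
    """Check that `data` has every expected column and that all columns have equal length.

    Returns the number of objects and a cleaned copy with values cast to float.
    """
    missing = [col for col in expected_columns if col not in data]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    # Keep only the expected columns and cast every value to float.
    cleaned = {col: [float(v) for v in data[col]] for col in expected_columns}

    lengths = {len(values) for values in cleaned.values()}
    if len(lengths) > 1:
        raise ValueError(f"Columns have inconsistent lengths: {sorted(lengths)}")

    n_objects = lengths.pop() if lengths else 0
    return n_objects, cleaned


# Hypothetical magnitude column names, standing in for what a CatalogTag would define.
columns = ["mag_g", "mag_r"]
n_objects, cleaned = validate_data_sketch(
    {"mag_g": [24, 23.5], "mag_r": [23.1, 22.9]}, columns
)
print(n_objects)         # 2
print(cleaned["mag_g"])  # [24.0, 23.5]
```

Raising on a missing column or on ragged column lengths mirrors the stated purpose of the method, which is to confirm that the data are appropriate for the CatalogTag before a Dataset is created.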
Estimator
- class rail_pz_service.db.Estimator(**kwargs)
Combination of an Algorithm, a trained Model to apply to the data, and any specific configuration overrides.
- pydantic_mode_class
alias of
Estimator
- id: Mapped[int]
primary key
- name: Mapped[str]
Name for this Estimator, unique
- algo_id: Mapped[int]
foreign key into ‘Algorithm’ table
- catalog_tag_id: Mapped[int]
foreign key into ‘CatalogTag’ table
- model_id: Mapped[int]
foreign key into ‘Model’ table
- config: Mapped[dict | None]
Configuration parameters for this estimator
- algo_: Mapped[Algorithm]
Access to associated Algorithm
- catalog_tag_: Mapped[CatalogTag]
Access to associated CatalogTag
- model_: Mapped[Model]
Access to associated Model
- requests_: Mapped[list[Request]]
Access to list of associated Request
- col_names_for_table = ['id', 'name', 'algo_id', 'catalog_tag_id', 'model_id']
column names to use when printing the table
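The `config` column holds configuration overrides and may be None. One plausible reading, sketched below, is that overrides are layered on top of an algorithm's default parameters. How rail_pz_service actually applies `config` is not specified here. The helper `effective_config` and the parameter names `zmin`, `zmax`, and `nzbins` are illustrative assumptions.

```python
from typing import Any, Dict, Optional


def effective_config(
    defaults: Dict[str, Any], overrides: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a new dict with `overrides` layered on top of `defaults`.

    A None override (the column is nullable) leaves the defaults unchanged.
    """
    merged = dict(defaults)         # copy so the defaults are never mutated
    merged.update(overrides or {})  # override values win on key collisions
    return merged


# Illustrative parameter names only.
algo_defaults = {"zmin": 0.0, "zmax": 3.0, "nzbins": 301}
estimator_config = {"nzbins": 101}

full = effective_config(algo_defaults, estimator_config)
print(full)  # {'zmin': 0.0, 'zmax': 3.0, 'nzbins': 101}
```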
Model
- class rail_pz_service.db.Model(**kwargs)
Specific ML model that is trained to work with a specific Algorithm on a particular type of data (CatalogTag).
Typically a Model is stored as a pickle file.
The rail.core.model.Model class provides a standard wrapper to store metadata such as the name of the python class that created the model, and the applicable CatalogTag to use the model with.
- pydantic_mode_class
alias of
Model
- id: Mapped[int]
primary key
- name: Mapped[str]
Name for this Model, unique
- path: Mapped[str]
Path to the relevant file
- algo_id: Mapped[int]
foreign key into Algorithm table
- catalog_tag_id: Mapped[int]
foreign key into CatalogTag table
- algo_: Mapped[Algorithm]
Access to associated Algorithm
- catalog_tag_: Mapped[CatalogTag]
Access to associated CatalogTag
- estimators_: Mapped[list[Estimator]]
Access to list of associated Estimator
- col_names_for_table = ['id', 'name', 'algo_id', 'catalog_tag_id', 'path']
column names to use when printing the table
- classmethod validate_model(path, algo, catalog_tag)
Validate that the model is appropriate for the Algorithm and CatalogTag
- Parameters:
path (Path) – File with the model
algo (Algorithm) – Algorithm in question
catalog_tag (CatalogTag) – Catalog tag in question
- Return type:
None
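As a rough illustration of what validating a pickled model against its metadata could look like, the sketch below stores a plain dict in a pickle file and compares two metadata fields. It does not use the rail.core.model.Model wrapper. The dict keys `creation_class_name` and `catalog_tag`, the function `validate_model_sketch`, and the names `my_pkg.MyInformer` and `my_tag` are all hypothetical, and how the real method relates the creating class to an Algorithm is not specified here.

```python
import pickle
import tempfile
from pathlib import Path


def validate_model_sketch(
    path: Path, expected_creator: str, expected_catalog_tag: str
) -> None:
    """Raise ValueError if the pickled metadata does not match expectations."""
    # Note: unpickling executes arbitrary code, so only load files you trust.
    with open(path, "rb") as fin:
        metadata = pickle.load(fin)

    creator = metadata.get("creation_class_name")
    tag = metadata.get("catalog_tag")
    if creator != expected_creator:
        raise ValueError(f"Model created by {creator!r}, expected {expected_creator!r}")
    if tag != expected_catalog_tag:
        raise ValueError(f"Model has catalog tag {tag!r}, expected {expected_catalog_tag!r}")


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"
    with open(model_path, "wb") as fout:
        # Hypothetical metadata layout, used only for this sketch.
        pickle.dump(
            {"creation_class_name": "my_pkg.MyInformer", "catalog_tag": "my_tag", "data": None},
            fout,
        )

    # Matching metadata: returns None, like the documented return type.
    ok_result = validate_model_sketch(model_path, "my_pkg.MyInformer", "my_tag")

    # Mismatched catalog tag: the sketch raises ValueError.
    try:
        validate_model_sketch(model_path, "my_pkg.MyInformer", "other_tag")
        mismatch_error = None
    except ValueError as exc:
        mismatch_error = str(exc)

print(ok_result)
print(mismatch_error)
```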
Algorithm
- class rail_pz_service.db.Algorithm(**kwargs)
Algorithm is a wrapper for a specific RAIL class that implements a particular p(z) estimation algorithm.
This just defines the particular python class implementing the algorithm. The selection of a particular instance of the trained Model and any non-default parameters used to initialize an Estimator are handled in their own classes.
- pydantic_mode_class
alias of
Algorithm
- id: Mapped[int]
primary key
- name: Mapped[str]
Name for this Algorithm, unique
- class_name: Mapped[str]
Name for the python class implementing the algorithm
- estimators_: Mapped[list[Estimator]]
Access to list of associated Estimator
- models_: Mapped[list[Model]]
Access to list of associated Model
- col_names_for_table = ['id', 'name', 'class_name']
column names to use when printing the table
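Because an Algorithm only records the name of a python class, something has to turn that string back into a class object before it can be used. A minimal sketch of that lookup with the standard library follows. It assumes `class_name` is a fully qualified dotted path, which this page does not confirm, and `resolve_class` is a hypothetical helper rather than a rail_pz_service function. A standard-library class is used as the example target so that the snippet runs anywhere.

```python
import importlib


def resolve_class(class_name: str) -> type:
    """Import and return the class named by a dotted path such as 'pkg.module.Class'."""
    module_name, _, attr_name = class_name.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected a dotted path, got {class_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


cls = resolve_class("collections.OrderedDict")
print(cls.__name__)  # OrderedDict
```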
CatalogTag
- class rail_pz_service.db.CatalogTag(**kwargs)
Defines what kind of catalog we are analyzing data from, specifically what to expect for the names of the magnitude columns.
This is implemented in the rail.utils.catalog_utils module, which uses a catalog tag to set the default parameters for RAIL modules to match the catalog.
- pydantic_mode_class
alias of
CatalogTag
- id: Mapped[int]
primary key
- name: Mapped[str]
Name for this CatalogTag, unique
- class_name: Mapped[str]
Name for the python class implementing the CatalogTag
- estimators_: Mapped[list[Estimator]]
Access to list of associated Estimator
- models_: Mapped[list[Model]]
Access to list of associated Model
- datasets_: Mapped[list[Dataset]]
Access to list of associated Dataset
- col_names_for_table = ['id', 'name', 'class_name']
column names to use when printing the table
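Each class above exposes a `col_names_for_table` list naming the columns to show when printing. The sketch below shows one way such a list could drive a simple text table. `AlgorithmRow` and `format_table` are hypothetical stand-ins, the row values are made up, and the actual printing code in rail_pz_service may look quite different.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AlgorithmRow:
    """Hypothetical stand-in for a row of the algorithm table."""

    id: int
    name: str
    class_name: str

    # Un-annotated class attribute, so dataclass does not treat it as a field.
    col_names_for_table = ["id", "name", "class_name"]


def format_table(rows: Iterable[object], col_names: List[str]) -> str:
    """Render a header line plus one line per row, using only the named columns."""
    lines = [" | ".join(col_names)]
    for row in rows:
        lines.append(" | ".join(str(getattr(row, col)) for col in col_names))
    return "\n".join(lines)


rows = [AlgorithmRow(id=1, name="algo_a", class_name="my_pkg.AlgoA")]
table = format_table(rows, AlgorithmRow.col_names_for_table)
print(table)
# id | name | class_name
# 1 | algo_a | my_pkg.AlgoA
```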