getml.data.Subset dataclass
Subset(
    container_id: str,
    peripheral: Dict[str, Union[DataFrame, View]],
    population: Union[DataFrame, View],
)
A Subset consists of a population table and one or several peripheral tables.
It is passed by a Container, StarSchema and TimeSeries to the Pipeline.
ATTRIBUTE | DESCRIPTION |
---|---|
container_id | The ID of the container the subset belongs to. TYPE: str |
peripheral | A dictionary containing the peripheral tables, keyed by name. TYPE: Dict[str, Union[DataFrame, View]] |
population | The population table. TYPE: Union[DataFrame, View] |
Example
container = getml.data.Container(
    train=population_train,
    test=population_test
)
container.add(
    meta=meta,
    order=order,
    trans=trans
)
# train and test are Subsets.
# They contain population_train
# and population_test respectively,
# as well as their peripheral tables
# meta, order and trans.
my_pipeline.fit(container.train)
my_pipeline.score(container.test)
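
The shape of a Subset can be sketched without a running getML engine. The snippet below is only an illustration: `SubsetSketch` is a hypothetical stand-in that mirrors the three documented fields, plain Python lists stand in for DataFrames or Views, and the container ID is made up. It is not getML's implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SubsetSketch:
    """Hypothetical stand-in mirroring the three documented Subset fields."""

    container_id: str
    peripheral: Dict[str, Any]
    population: Any


# Plain lists of dicts stand in for getML DataFrames or Views (assumption).
population_train = [{"id": 1, "target": 0}, {"id": 2, "target": 1}]
meta = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]

train = SubsetSketch(
    container_id="container-1",  # made-up ID, for illustration only
    peripheral={"meta": meta},   # peripheral tables are looked up by name
    population=population_train,
)

print(train.container_id)
print(sorted(train.peripheral))
print(len(train.population))
```

The point of the sketch is that a subset bundles one population table with its named peripheral tables, so a single object carries everything that is handed to `fit` or `score`.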