python - Dagster: preserve multiple sklearn model in asset - Stack Overflow

I have a linear regression asset in dagster that uses previously computed data and sklearn's LinearRegression (Python 3.10 here).

For each of my input columns (each one represents a country), I want to fit a linear regression model.

Everything works fine. My question is about outputting these models (or should I use dagster metadata?).

Basically, I want a train asset and a forecast asset, and for this I want to return the trained models from the train asset and load them in the forecast asset. One solution would be to save them locally, but I want to use dagster exclusively.

Also, I would like to save plenty of metadata (score, RMSE) for each model in the train asset's metadata.

Here is my code:

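from dagster import asset
from dagster_duckdb import DuckDBResource
from sklearn.linear_model import LinearRegression

@asset(deps=[])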
def train_linear_regression(duckdb: DuckDBResource):
    """Use a pivot table with time series data to forecast.
    Uses linear regression.
    """
    # Setting up query.
    query = "SELECT * FROM pivot_table_model"
    # Execute the query.
    with duckdb.get_connection() as conn:
        df = conn.execute(query).df()
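    # Assumption: X is not defined in the original snippet; the "year" column is used as the only feature.
    X = df[["year"]]
    output = {}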
    for country_name in df.drop(columns=["year"]).columns:
        # Setting Y.
        Y = df.loc[:, country_name] # Retrieving population - pd.Series.
        # Preparing linear model.
        linear_regression = LinearRegression()
        # Fitting the model.
        linear_regression.fit(X, Y)
        # Scoring.
        score = linear_regression.score(X, Y)
        output[country_name] = {
            "model": linear_regression, 
            "score": float(score),
            "plot": generate_plot(df),
        }

What should I return?
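
One possible way to prepare the return value, sketched under assumptions: split the per-country `output` dict into a dict of fitted models (the asset's value) and a flat dict of scalar diagnostics (the asset's metadata). The helper name `split_models_and_metadata` and the `<country>_<key>` naming are hypothetical, not part of dagster, and plots or other non-scalar entries are simply skipped here.

```python
from typing import Any, Dict, Tuple


def split_models_and_metadata(
    results: Dict[str, Dict[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate fitted models from scalar diagnostics.

    `results` has the same shape as `output` in the question:
    {country: {"model": ..., "score": ..., ...}}.
    """
    models: Dict[str, Any] = {}
    metadata: Dict[str, Any] = {}
    for country, entry in results.items():
        models[country] = entry["model"]
        for key, value in entry.items():
            # Keep only simple scalars; plots or other objects need their own handling.
            if key != "model" and isinstance(value, (int, float, str)):
                metadata[f"{country}_{key}"] = value
    return models, metadata
```

At the end of the train asset, `models, metadata = split_models_and_metadata(output)` followed by `return Output(models, metadata=metadata)` (with `Output` imported from `dagster`) would attach the scores to the materialization. A forecast asset that takes `train_linear_regression` as a parameter should then receive the `models` dict through whichever I/O manager is configured, so no manual saving is needed, provided that I/O manager can serialize the fitted models.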