
Last active 14 days ago

7 replies

5 views

  • MI

    Is there a suggested approach for dependency management, tests and CI? We're using the sklearn integration, and have unit tests that import steps that use sklearn. In CI we currently install only the requirements that are not managed by zenml integrations, hence sklearn is not present in that environment. The options I see:
    • Duplicate sklearn -- list it both in the regular requirements and in the zenml integrations -- so it's installed within CI (but that seems like a potential version conflict?)
    • Install the zenml integrations (or only the sklearn integration) within CI -- additional steps in CI
    Is there a standard way? Opinions?

  • JA

    Can't you simply list zenml as an install/runtime requirement for your project and zenml[sklearn] (or whatever) as a dev/test requirement? Either with tests_require if you are using setup.py, or in the pyproject.toml file using something like

    ```
    [project.optional-dependencies]
    test = ["zenml[sklearn]"]
    ```
    or, if using poetry, with

    ```
    [tool.poetry.group.test.dependencies]
    # "*" is a placeholder version constraint
    zenml = {version = "*", extras = ["sklearn"]}
    ```
    (or using whatever format you use for your dependency manager).

  • MI

    Would the extras and zenml integration install (what we use on local dev machines to install integrations) install the same version of sklearn?

  • MI

    If so, that's neat, works for this case I think

  • FE

    I believe the zenml[sklearn] syntax doesn’t work as expected yet. I just tried it, and for me it only installed ZenML itself, not the sklearn integration. Good point, though; perhaps we should enable that syntax.

    At the moment, I would say running zenml integration install in your CI is the best way. That’s also what we do in our own CI.

    E.g.:
    zenml integration install sklearn -y
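
A minimal sketch of how the CI steps described above might be ordered, written as a shell function. The file name `requirements.txt`, the `tests/` directory, and the function name `ci_setup_and_test` are assumptions for illustration and do not come from the thread; only the `zenml integration install sklearn -y` command does.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: install dependencies, then run the unit tests.
ci_setup_and_test() {
  # Regular project requirements (assumed to include zenml itself).
  pip install -r requirements.txt
  # Let ZenML install the sklearn integration's dependencies; -y skips the prompt.
  zenml integration install sklearn -y
  # Run the unit tests that import sklearn-based steps (test path is an assumption).
  pytest tests/
}
```

A CI job would call `ci_setup_and_test` as its test step. Installing the integration after the regular requirements keeps sklearn out of `requirements.txt`, so it is declared in only one place.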

  • JA

    Ah, you are right, the only extra that seems to be enabled is server. It would be convenient to have extras defined for the integrations as well, to avoid dependencies getting mixed up across different management systems (e.g. pip vs. poetry vs. conda).

  • ST

    As an alternative, you could export the sklearn integration's requirements with zenml integration export-requirements sklearn into a requirements-extra.txt file and maintain/use that in your CI. Dependency management is a known pain point with zenml integrations, especially when it leads to conflicts, and it is one that we're working on improving.
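
The export-based alternative could look roughly like the sketch below. It assumes that `zenml integration export-requirements` prints the requirements to stdout when no output file is given, and that the regular requirements live in `requirements.txt`; both function names are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Run locally whenever zenml is upgraded, then commit the resulting file.
export_sklearn_requirements() {
  zenml integration export-requirements sklearn > requirements-extra.txt
}

# Run in CI: passing both files to a single pip call lets pip resolve them
# together, so a version conflict fails the install instead of going unnoticed.
install_ci_requirements() {
  pip install -r requirements.txt -r requirements-extra.txt
}
```

With this split, CI does not need the zenml CLI to install the integration's dependencies, at the cost of keeping `requirements-extra.txt` up to date by hand.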
