This repository was archived by the owner on Jul 13, 2025. It is now read-only.

Milestones


  • ## Intro

    Well, the promised platform and back office for providing the data never happened. I did much of that via manual labor, writing hundreds of small scripts, and so on, but no one did enough to provide what was needed. So we are stuck with either no data for some parameters (#112 and #113) or bad data for the others (#156).

    This milestone includes all issues regarding the full replacement of every parameter/question/feature that had insufficient or bad data. For each feature, the following needs to be done (see the sketch after this item):

    - extracting domain-expert importance values
    - deciding the order they should come in, either manually or via the importance value (equivalent to XAI)
    - some modification for `seduce`-ing on the effects of each variable (use the notebook to change the coefficient of the effect)
    - question and answer titles and the fun aspect

    No due date
    8/8 issues closed
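
    As a minimal sketch of the middle two steps, assuming hypothetical feature names and importance values (nothing below is taken from the actual data), ordering by expert importance and nudging an effect coefficient could look like:

    ```python
    # Hypothetical expert-elicited importance values; the feature names
    # and numbers are illustrative only.
    EXPERT_IMPORTANCE: dict[str, float] = {
        "applicant_age": 0.90,
        "travel_history": 0.70,
        "bank_balance": 0.55,
    }

    def order_features(importance: dict[str, float]) -> list[str]:
        """Sort features by descending expert importance, the same
        ordering an XAI-style importance ranking would produce."""
        return sorted(importance, key=importance.get, reverse=True)

    def scale_effect(coefficient: float, factor: float) -> float:
        """Nudge a feature's effect coefficient, as done interactively
        in the notebook when tuning how strongly a variable `seduce`s."""
        return coefficient * factor

    print(order_features(EXPERT_IMPORTANCE))
    # -> ['applicant_age', 'travel_history', 'bank_balance']
    ```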
  • ## Goal

    Data for Vizard comes from isolated data chunks and is integrated using scripts I made for the ETL pipeline. In the near future, Visaland CRM is going to be used as the main app, and hence all the data required for Vizard will come through Visaland CRM. The plan is to integrate the Visaland CRM API into Vizard, so that the next time we do ETL, run data analysis, or build machine learning models and provide insights, the data comes from the Visaland CRM.

    ## Tasks

    For this to happen, there are some basic tasks:

    - [ ] Make sure all required data (current and near-future) is present in the CRM (the database matches our current data models)
    - [ ] Validate data from the API so it fits into our preprocessing and transforming pipeline. For that, https://docs.greatexpectations.io/ seems great (see [use case](https://greatexpectations.io/case-studies/how-heineken-uses-gx-to-provide-instant-data-quality-validation-and-feedback)); a sketch follows this item
    - [ ] Build and test the full (offline) pipeline of ingesting data from the CRM API and processing it via Vizard

    ## Remarks

    - Do not engage with stream (online) data processing
    - It would be nice if we could integrate Apache Airflow

    No due date
    1/3 issues closed
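
    As a rough sketch of the validation task, assuming the classic pandas-flavored Great Expectations API (pre-1.0 `great_expectations`) and a hypothetical `fetch_crm_records()` helper standing in for the Visaland CRM API client:

    ```python
    import pandas as pd
    import great_expectations as ge

    def fetch_crm_records() -> pd.DataFrame:
        # Placeholder: in reality this would call the Visaland CRM API.
        return pd.DataFrame({"applicant_id": [1, 2], "age": [34, 28]})

    # Wrap the raw frame so expectation methods become available on it.
    df = ge.from_pandas(fetch_crm_records())

    # Expectations mirroring our current data models (the columns and
    # bounds here are illustrative).
    df.expect_column_values_to_not_be_null("applicant_id")
    df.expect_column_values_to_be_between("age", min_value=18, max_value=100)

    result = df.validate()
    if not result.success:
        raise ValueError("CRM data failed validation; aborting the ETL step")
    ```

    Gating the pipeline on `result.success` keeps bad CRM records from silently reaching the preprocessing and transforming stages.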
  • The goal is to containerize the whole `vizard` project by following best practices, such as (see the sketch after this item):

    - separating dev/test/run environments and images
    - separating the base, MLflow, and API code, each as a separate image (possibly)
    - persisting DBs and handling other data-related issues via Docker
    - optimizing the Docker images in terms of size, stages, caching, etc.
    - properly documenting the build process and how to use the Docker setup

    No due date
    4/4 issues closed
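
    A hypothetical multi-stage Dockerfile sketch for the points above; the stage names, file paths, and entrypoint are illustrative, not taken from the repo:

    ```dockerfile
    # --- base: shared runtime dependencies, cached as their own layer ---
    FROM python:3.10-slim AS base
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    # --- test: dev/test tooling layered on top of base ---
    FROM base AS test
    COPY requirements-dev.txt .
    RUN pip install --no-cache-dir -r requirements-dev.txt
    COPY . .
    RUN pytest

    # --- run: minimal final image with only the code and base deps ---
    FROM base AS run
    COPY . .
    CMD ["python", "-m", "vizard"]
    ```

    Each environment is then selected at build time (e.g. `docker build --target run -t vizard:run .`), and DB persistence would come from named volumes at run time (e.g. `docker run -v vizard-data:/app/data vizard:run`).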