add gcp storage to xgboost-operator#81
Conversation
Hi @xfate123. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files; approvers can indicate their approval by writing `/approve` in a comment.
Draft updated. I'd appreciate further review.
config/samples/xgboost-dist/utils.py
Outdated
    'feature_importance.json')

    gcp_path = gcp_parameters['path']
    logger.info('---- export model ----')
merlintang
left a comment
Also update the YAML files, and the README to help users as well.
    fscore_dict = booster.get_fscore()
    with open(feature_importance, 'w') as file:
        file.write(json.dumps(fscore_dict))
    logger.info('---- chief dump model successfully!')
I learned it from the dump-to-OSS module; I think the logic is to dump the model locally first, and then upload it from local disk to the cloud.
    upload_gcp(gcp_parameters, model_fname, aux_path)
    upload_gcp(gcp_parameters, text_model_fname, aux_path)
    upload_gcp(gcp_parameters, feature_importance, aux_path)
    else:
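Taking the review suggestion to log a success message, the three `upload_gcp` calls could be wrapped in a small loop. This is only a sketch: `upload_artifacts` is a hypothetical helper, and the `upload_gcp(gcp_parameters, local_path, aux_path)` signature is assumed from the diff above.

```python
import logging

logger = logging.getLogger(__name__)


def upload_artifacts(upload_fn, gcp_parameters, local_paths, aux_path):
    """Upload each local artifact via upload_fn (e.g. upload_gcp)
    and log success, as the review suggests."""
    for path in local_paths:
        upload_fn(gcp_parameters, path, aux_path)
        logger.info('---- model artifact %s uploaded successfully!', path)
    return len(local_paths)
```

Logging after each upload (rather than once at the end) makes partial failures easier to spot in the pod logs.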
Could you add a log line to say that the model was uploaded successfully?
…_v1alpha1_iris_predict_oss.yaml
…tjob_v1alpha1_iris_predict_gcp.yaml
…ob_v1alpha1_iris_train_gcp.yaml
@merlintang updated the README for users' convenience, and also specified the YAML files for OSS users and GCP users. I'd appreciate further review.
    Similarly, xgboostjob_v1alpha1_iris_predict.yaml is used to configure XGBoost job batch prediction.

    **Configure GCP parameter**
    For training jobs in GCP , you could configure xgboostjob_v1alpha1_iris_train.yaml and xgboostjob_v1alpha1_iris_predict.yaml
The YAML file names are not correct.
    For training jobs in GCP, you could configure xgboostjob_v1alpha1_iris_train.yaml and xgboostjob_v1alpha1_iris_predict.yaml.
    Note, we use [GCP](https://cloud.google.com/) to store the trained model;
    thus, you need to specify the GCP parameter in the YAML file. Therefore, remember to fill in the GCP parameter in the xgboostjob_v1alpha1_iris_train.yaml and xgboostjob_v1alpha1_iris_predict.yaml files.
    The GCP parameter includes account information such as type, client_id, client_email, private_key_id, private_key and access_bucket.
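As an illustration of how those fields might be consumed, a small validator could collect them into a service-account-style credentials dict. This is a hypothetical sketch: the field names come from the README text above, but the parsing itself is an assumption, not the operator's actual code.

```python
def build_service_account_info(gcp_parameters):
    """Validate the account fields named in the README and collect
    them into one credentials dict; the bucket is returned separately.
    Hypothetical helper -- not the operator's actual implementation."""
    required = ('type', 'client_id', 'client_email',
                'private_key_id', 'private_key', 'access_bucket')
    missing = [k for k in required if not gcp_parameters.get(k)]
    if missing:
        raise ValueError('missing GCP parameters: %s' % ', '.join(missing))
    bucket = gcp_parameters['access_bucket']
    creds = {k: gcp_parameters[k] for k in required if k != 'access_bucket'}
    return creds, bucket
```

Failing fast on missing fields gives the user a clear error before any upload is attempted.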
    spec:
      containers:
        - name: xgboostjob
          image: docker.io/merlintang/xgboost-dist-iris:1.1
The image name is not correct; you need to build a new image with the new code.
For sure, thanks for your advice.
Just to double-check: you mean build a new image with the new Python code and update all the YAML files in this folder to use the new image.
Do I understand correctly?
Yeah.
… On May 16, 2020, at 4:14 PM, xfate123 ***@***.***> wrote:
@xfate123 commented on this pull request.
In config/samples/xgboost-dist/xgboostjob_v1alpha1_iris_train_gcp.yaml:
> +apiVersion: "xgboostjob.kubeflow.org/v1alpha1"
+kind: "XGBoostJob"
+metadata:
+ name: "xgboost-dist-iris-test-train-gcp"
+spec:
+ xgbReplicaSpecs:
+ Master:
+ replicas: 1
+ restartPolicy: Never
+ template:
+ apiVersion: v1
+ kind: Pod
+ spec:
+ containers:
+ - name: xgboostjob
+ image: docker.io/merlintang/xgboost-dist-iris:1.1
just double check, you mean build a new image with the new Python code and update all the YAML files in this folder to use the new image.
Do I understand correctly?
@merlintang I already created the new image and updated all the YAML files to use it. It still needs further testing.
Change the PR title; you still have work in progress.
    - --job_type=Predict
    - --model_path=autoAI/xgb-opt/2
    - --model_storage_type=gcp
    - --gcp_param=unknown
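Since `--gcp_param` is passed as a single string, the entrypoint presumably parses it into a dict. One plausible wire format is comma-separated `key=value` pairs, with `unknown` meaning no parameters were given; both the format and the helper name are assumptions for illustration, not the operator's actual convention.

```python
def parse_gcp_param(raw):
    """Parse a --gcp_param string like
    'access_bucket=b,type=service_account' into a dict.
    'unknown' or an empty string means no GCP parameters were given.
    Hypothetical sketch of the argument handling."""
    if not raw or raw == 'unknown':
        return {}
    return dict(item.split('=', 1) for item in raw.split(','))
```

Splitting on the first `=` only (`split('=', 1)`) keeps values that themselves contain `=` characters intact.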
    spec:
      containers:
        - name: xgboostjob
          image: docker.io/xfate123/xgboost-dist-iris:1.1
    spec:
      containers:
        - name: xgboostjob
          image: docker.io/xfate123/xgboost-dist-iris:1.1
          imagePullPolicy: Always
          args:
            - --job_type=Predict
            - --model_path=autoAI/xgb-opt/2
Can we simplify the model path?
      containers:
        - name: xgboostjob
    -     image: docker.io/merlintang/xgboost-dist-iris:1.1
    +     image: docker.io/xfate123/xgboost-dist-iris:1.1
Do we have a Dockerfile for this image in this repo?
We only have the image in this repo.
I'm thinking about adding another storage option for our xgboost-operator. Still working on it.