Tomcli

Strongly interested in Cloud, Containers, Serverless, and Machine Learning.

Member Since 6 years ago

@IBM - @CODAIT, San Francisco, CA

45 followers
6 following
95 stars
126 repos

626 contributions in the last year

Pinned
⚡ Kubeflow Pipelines on Tekton
⚡ Fabric for Deep Learning (FfDL, pronounced fiddle) is a Deep Learning Platform offering TensorFlow, Caffe, PyTorch etc. as a Service on Kubernetes
⚡ Machine Learning eXchange (MLX). Data and AI Assets Catalog and Execution Engine
⚡ Serverless Inferencing on Kubernetes
⚡ Machine Learning Toolkit for Kubernetes
⚡ Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Activity
Oct
28
20 hours ago
issue

Tomcli issue comment kubeflow/kfp-tekton

Tomcli

Add tag to frontend code with changes for Tekton

Signed-off-by: Andrew-Butler [email protected]

issue

Tomcli issue comment machine-learning-exchange/katalog

Tomcli

Update compiled pipeline katalog with Kubeflow 1.4

Right now our pipeline katalog is compiled with the Kubeflow 1.3 (kfp-tekton 0.8) SDK. Although the pipelines can run on Kubeflow 1.4, some features, such as metadata tracking, do not fully function because Kubeflow 1.4 uses a slightly different artifact name mapping.

issue

Tomcli issue machine-learning-exchange/katalog

Tomcli

Update compiled pipeline katalog with Kubeflow 1.4

Right now our pipeline katalog is compiled with the Kubeflow 1.3 (kfp-tekton 0.8) SDK. Although the pipelines can run on Kubeflow 1.4, some features, such as metadata tracking, do not fully function because Kubeflow 1.4 uses a slightly different artifact name mapping.

pull request

Tomcli pull request kubeflow/kfp-tekton

Tomcli

Fix a typo for the openshift docs

Which issue is resolved by this Pull Request: Resolves #

Description of your changes: There's a typo in the command for deploying the OpenShift SCC. It should be deployed with the -k flag because our manifests are written in kustomize.

Environment tested:

  • Python Version (use python --version):
  • Tekton Version (use tkn version):
  • Kubernetes Version (use kubectl version):
  • OS (e.g. from /etc/os-release):

Checklist:

created branch

Tomcli in kubeflow/kfp-tekton create branch Tomcli-patch-1

created 19 hours ago
issue

Tomcli issue kserve/website

Tomcli

Update AIF explainer example with the latest knative API.

Expected Behavior

Since KServe has been updated to a later version of Knative, Knative Service v1alpha1 is no longer supported, so we need to update the AIF example to use the latest Knative API.
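
The kind of change this issue asks for can be sketched in a few lines. This is only an illustration, not the actual fix: the `upgrade_knative_service` helper, the manifest contents, and the placeholder image are assumptions, and a real v1alpha1 manifest may also need its `spec` restructured, since only the `apiVersion` is rewritten here.

```python
import copy

OLD_API = "serving.knative.dev/v1alpha1"
NEW_API = "serving.knative.dev/v1"


def upgrade_knative_service(manifest: dict) -> dict:
    """Return a copy of a Knative Service manifest that targets the v1 API.

    Hypothetical helper: it only rewrites apiVersion and leaves spec untouched.
    """
    upgraded = copy.deepcopy(manifest)
    if upgraded.get("kind") == "Service" and upgraded.get("apiVersion") == OLD_API:
        upgraded["apiVersion"] = NEW_API
    return upgraded


# Illustrative manifest; the image name is a placeholder, not the real one.
message_dumper = {
    "apiVersion": OLD_API,
    "kind": "Service",
    "metadata": {"name": "message-dumper"},
    "spec": {
        "template": {
            "spec": {"containers": [{"image": "example.invalid/message-dumper"}]}
        }
    },
}

upgraded = upgrade_knative_service(message_dumper)
print(upgraded["apiVersion"])
```

The original dictionary is left unchanged, so the old and new manifests can be compared before anything is applied to a cluster.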

issue

Tomcli issue comment kserve/kserve

Tomcli

Update AIF explainer example

What this PR does / why we need it: Update the AIF explainer example with the correct commands and fix the message dumper knative version.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #1886

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:


pull request

Tomcli pull request kserve/website

Tomcli

Update AIF explainer example

Fixes #1886

Proposed Changes

  • Update AIF explainer docs. Since KServe has been updated to a later version of Knative, Knative Service v1alpha1 is no longer supported, so we need to update the AIF example to use the latest Knative API.
push

Tomcli push Tomcli/website-2

Tomcli

commit sha: 31ad5bbd3e598b34889f08301cbe4fde7621a44f

pushed 20 hours ago
fork

Tomcli forked kserve/website

⚡ User documentation for KServe.
Tomcli Apache License 2.0 Updated
forked 20 hours ago
issue

Tomcli issue comment kserve/kserve

Tomcli

Update AIF explainer example

What this PR does / why we need it: Update the AIF explainer example with the correct commands and fix the message dumper knative version.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #1886

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:


Tomcli

Thanks @yuzisun, I will also open a PR for the website.

pull request

Tomcli pull request kserve/kserve

Tomcli

Update AIF explainer example

What this PR does / why we need it: Update the AIF explainer example with the correct commands and fix the message dumper knative version.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #1886

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:


push

Tomcli push Tomcli/kfserving

Tomcli

Update message-dumper to use knative v1

commit sha: dc7fe371222a8b35469134c018a7087e7642961d

pushed 20 hours ago
push

Tomcli push Tomcli/kfserving

Tomcli

Update README with the right payload json

commit sha: 31aa066c06a2c8b37b2bf7493d05eeb647db1b46

pushed 20 hours ago
issue

Tomcli issue kserve/kserve

Tomcli

Update AIF explainer example with the latest knative API.

/kind bug

What steps did you take and what happened: Since KServe has been updated to a later version of Knative, Knative Service v1alpha1 is no longer supported, so we need to update the AIF example to use the latest Knative API.

What did you expect to happen:

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version:
  • Kubeflow version:
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
Oct
27
1 day ago
issue

Tomcli issue comment kubeflow/community

Tomcli

Archive inactive Kubeflow repos

There are multiple repos in the Kubeflow github organization where no code is being checked in. We shall identify and archive and/or move those repos. I will follow up with a subset of the list, and then we can expand from there.

Tomcli

https://github.com/kubeflow/kfp-tekton-backend can be archived since we migrated that repo to kfp-tekton already.

issue

Tomcli issue comment kserve/kserve

Tomcli

[kubeflow 1.3] unable to route requests to the kfserving pod due to auth policy

/kind bug

What steps did you take and what happened: Deploy an InferenceService example.

I am installing in a preexisting Knative deployment and perhaps I'm missing some steps. When I make a request, either using the external route/ingress or the internal service, the request hangs for a long time and then times out. Retracing the steps of the request, I ended up in the activator pod and found this error message:

{"level":"error","ts":"2021-04-24T19:10:16.241Z","logger":"activator","caller":"net/revision_backends.go:322","msg":"Failed to probe clusterIP 172.30.171.229:80","knative.dev/controller":"activator","knative.dev/pod":"activator-886cd96fb-4gqq5","knative.dev/key":"raffa/flowers-sample-predictor-default-00002","error":"unexpected body: want \"queue\", got \"RBAC: access denied\"","stacktrace":"knative.dev/serving/pkg/activator/net.(*revisionWatcher).checkDests\n\t/opt/app-root/src/go/src/knative.dev/serving/pkg/activator/net/revision_backends.go:322\nknative.dev/serving/pkg/activator/net.(*revisionWatcher).run\n\t/opt/app-root/src/go/src/knative.dev/serving/pkg/activator/net/revision_backends.go:366"}
{"level":"warn","ts":"2021-04-24T19:10:16.440Z","logger":"activator","caller":"net/revision_backends.go:286","msg":"Failed probing pods","knative.dev/controller":"activator","knative.dev/pod":"activator-886cd96fb-4gqq5","knative.dev/key":"raffa/flowers-sample-predictor-default-00002","curDests":{"ready":"10.128.2.38:8012","notReady":""},"error":"unexpected body: want \"queue\", got \"RBAC: access denied\""}

so it looks like the activator pod is probing the kfserving pod for the length of the queue but it's getting an RBAC error, due to, presumably, this istio RBAC rule:

spec:
  rules:
    - when:
        - key: 'request.headers[kubeflow-userid]'
          values:
            - raffa
    - when:
        - key: source.namespace
          values:
            - raffa

This is a standard RBAC rule created by the Kubeflow profile when using a multitenant deployment. I am not 100% sure that this is what is holding up the requests, but it seems likely, because when I forge a request to go directly to the kfserving pod I get a response.

Am I missing something? This problem should affect any standard multitenant Kubeflow deployment. How is it normally fixed?

What did you expect to happen: being able to route requests to the kfserving pod.

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

  • Istio Version: 1.6.5
  • Knative Version: 0.19.0
  • KFServing Version: 1.3
  • Kubeflow version: 1.3
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version: na
  • Kubernetes version: (use kubectl version): 1.20
  • OS (e.g. from /etc/os-release):
Tomcli

We should add this AuthorizationPolicy to the Kubeflow manifests because Kubeflow 1.4 still has this issue.
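
The diagnosis in the issue above can be illustrated with a small simulation. The sketch below is a deliberately simplified model, not Istio's actual evaluation engine: it assumes each rule holds a single `when` condition and that a request is allowed when at least one rule matches. The `is_allowed` helper is hypothetical, and the `knative-serving` and `istio-system` source namespaces are assumptions about where the activator and the ingress gateway run.

```python
from typing import Dict, List

# The two `when` conditions quoted in the issue, flattened to one per rule.
RULES: List[Dict[str, object]] = [
    {"key": "request.headers[kubeflow-userid]", "values": ["raffa"]},
    {"key": "source.namespace", "values": ["raffa"]},
]


def is_allowed(request_attrs: Dict[str, str], rules: List[Dict[str, object]] = RULES) -> bool:
    """Allow the request if at least one rule's condition matches (simplified model)."""
    return any(request_attrs.get(rule["key"]) in rule["values"] for rule in rules)


# A user request that carries the kubeflow-userid header (assumed attributes).
user_request = {
    "request.headers[kubeflow-userid]": "raffa",
    "source.namespace": "istio-system",
}

# The activator's probe: no user header, and a different source namespace (assumed).
activator_probe = {"source.namespace": "knative-serving"}

print("user request allowed:", is_allowed(user_request))
print("activator probe allowed:", is_allowed(activator_probe))
```

Under these assumptions the probe matches neither rule, which is consistent with the "RBAC: access denied" body in the activator log.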

open pull request

Tomcli wants to merge kubeflow/pipelines

Tomcli

WIP chore: kfserving -> kserve migration

KFServing to KServe migration

Checklist:

Tomcli

Maybe we should keep this section but point it to the commit prior to this PR, because Kubeflow 1.4 is still using KFServing.

pull request

Tomcli merge to kubeflow/pipelines

Tomcli

WIP chore: kfserving -> kserve migration

KFServing to KServe migration

Checklist:

issue

Tomcli issue comment kserve/kserve

Tomcli

Kubeflow pipeline integration for KServe

/kind feature

Describe the solution you'd like Create a new KServe component for Kubeflow Pipeline with newly released kserve SDK similar to KFServing component

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

issue

Tomcli issue comment kubeflow/kfp-tekton

Tomcli

fix(sdk): "when" in some ParallelFor loops

Which issue is resolved by this Pull Request: Resolves #761

Description of your changes: As described in the issue:

map_cel_vars = lambda a: '$(tasks.%s.results.%s)' % (sanitize_k8s_name(a['value'].split('.')[-1]),
  sanitize_k8s_name(a['output_name'])) if a.get('type', '') == dsl.PipelineParam else a.get('value', '')

condition_refs[template['metadata']['name']] = [
  {
    'input': map_cel_vars(condition_operand1),
    'operator': conditionOp_mapping[condition_operator['value']],
    'values': [map_cel_vars(condition_operand2)]
  }
]

This code uses the 'value' field and tries to decompose it instead of using 'op_name' directly. This PR changes that.
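
The difference between the two approaches can be shown with a toy example. This is a sketch under stated assumptions, not the SDK code: `sanitize_k8s_name` is a simplified stand-in for the real helper, the two `task_ref_*` functions are hypothetical, and the operand's field contents are invented to show how splitting `value` on dots can yield a task name that differs from `op_name`.

```python
import re


def sanitize_k8s_name(name: str) -> str:
    """Simplified stand-in for the SDK helper: lowercase, dash-separated, trimmed."""
    return re.sub(r"-+", "-", re.sub(r"[^-0-9a-z]+", "-", name.lower())).strip("-")


def task_ref_from_value(operand: dict) -> str:
    """Behaviour described in the issue: derive the task name by splitting 'value'."""
    task = sanitize_k8s_name(operand["value"].split(".")[-1])
    return "$(tasks.%s.results.%s)" % (task, sanitize_k8s_name(operand["output_name"]))


def task_ref_from_op_name(operand: dict) -> str:
    """Alternative named in the PR description: read the task name from 'op_name'."""
    return "$(tasks.%s.results.%s)" % (
        sanitize_k8s_name(operand["op_name"]),
        sanitize_k8s_name(operand["output_name"]),
    )


# Hypothetical operand; these field contents are invented for illustration only.
operand = {
    "op_name": "print-op",
    "output_name": "Output",
    "value": "loop-item.print-op-Output",
}

print(task_ref_from_value(operand))
print(task_ref_from_op_name(operand))
```

With this made-up operand, the first function points at a task called `print-op-output`, while the second points at `print-op`, the name that was actually recorded for the operation.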

Environment tested:

  • Python Version (use python --version): 3.9.0
  • Tekton Version (use tkn version): irrelevant
  • Kubernetes Version (use kubectl version): irrelevant
  • OS (e.g. from /etc/os-release): irrelevant

Checklist:

Oct
26
2 days ago
pull request

Tomcli merge to kubeflow/kfp-tekton

Tomcli

fix(sdk): "when" in some ParallelFor loops

Which issue is resolved by this Pull Request: Resolves #761

Description of your changes: As described in the issue:

map_cel_vars = lambda a: '$(tasks.%s.results.%s)' % (sanitize_k8s_name(a['value'].split('.')[-1]),
  sanitize_k8s_name(a['output_name'])) if a.get('type', '') == dsl.PipelineParam else a.get('value', '')

condition_refs[template['metadata']['name']] = [
  {
    'input': map_cel_vars(condition_operand1),
    'operator': conditionOp_mapping[condition_operator['value']],
    'values': [map_cel_vars(condition_operand2)]
  }
]

This code uses the 'value' field and tries to decompose it instead of using 'op_name' directly. This PR changes that.

Environment tested:

  • Python Version (use python --version): 3.9.0
  • Tekton Version (use tkn version): irrelevant
  • Kubernetes Version (use kubectl version): irrelevant
  • OS (e.g. from /etc/os-release): irrelevant

Checklist:

open pull request

Tomcli wants to merge kubeflow/kfp-tekton

Tomcli

fix(sdk): "when" in some ParallelFor loops

Which issue is resolved by this Pull Request: Resolves #761

Description of your changes: As described in the issue:

map_cel_vars = lambda a: '$(tasks.%s.results.%s)' % (sanitize_k8s_name(a['value'].split('.')[-1]),
  sanitize_k8s_name(a['output_name'])) if a.get('type', '') == dsl.PipelineParam else a.get('value', '')

condition_refs[template['metadata']['name']] = [
  {
    'input': map_cel_vars(condition_operand1),
    'operator': conditionOp_mapping[condition_operator['value']],
    'values': [map_cel_vars(condition_operand2)]
  }
]

This code uses the 'value' field and tries to decompose it instead of using 'op_name' directly. This PR changes that.

Environment tested:

  • Python Version (use python --version): 3.9.0
  • Tekton Version (use tkn version): irrelevant
  • Kubernetes Version (use kubectl version): irrelevant
  • OS (e.g. from /etc/os-release): irrelevant

Checklist:

Tomcli

As part of our lint, we only allow each line in the Python code to have a maximum of 140 characters. Can you break this code onto another line?

Functionally, it looks good to me.
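
A check like the 140-character limit mentioned in this review comment can be approximated locally before pushing. The sketch below is a generic illustration and not the project's lint configuration: `find_long_lines` is a hypothetical helper and the sample source is made up.

```python
from typing import List, Tuple

MAX_LINE_LENGTH = 140  # limit quoted in the review comment


def find_long_lines(source: str, limit: int = MAX_LINE_LENGTH) -> List[Tuple[int, int]]:
    """Return (line_number, length) for every line longer than the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# Made-up source: the second line is intentionally too long.
sample = "x = 1\n" + "y = '" + "a" * 140 + "'\n"

for number, length in find_long_lines(sample):
    print(f"line {number}: {length} characters (limit {MAX_LINE_LENGTH})")
```

A long lambda such as the one in this PR is usually easiest to shorten by turning it into a named function or by breaking the expression inside its parentheses.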

Oct
25
3 days ago
issue

Tomcli issue comment machine-learning-exchange/mlx

Tomcli

[API] Update to `elyra-server` package `3.2.1`

Changes:

  • Update to elyra-server==3.2.1 for running notebooks instead of kfp-notebook==0.26.0
  • Update pipeline template to use ExecuteFileOp instead of NotebookOp
  • Update build dependencies in Dockerfile since cryptography now requires Rust, which in Alpine images requires cargo
  • Enable anonymous read access to notebook requirements file before running notebooks, since Elyra pulls the requirements.txt file from Minio

Resolves #244

@Tomcli

delete

Tomcli in kubeflow/kfp-tekton delete branch Tomcli-patch-2

deleted 3 days ago
delete

Tomcli in kubeflow/kfp-tekton delete branch Tomcli-patch-1

deleted 3 days ago
issue

Tomcli issue comment kubeflow/kfp-tekton

Tomcli

Update trusted AI pipeline to have unique job names

Which issue is resolved by this Pull Request: Resolves #

Description of your changes: Update trusted AI pipeline to have unique job names so it can run multiple times without cleaning up the jobs.

Environment tested:

  • Python Version (use python --version):
  • Tekton Version (use tkn version):
  • Kubernetes Version (use kubectl version):
  • OS (e.g. from /etc/os-release):

Checklist:

Tomcli

This can work on both the Argo and Tekton runtimes because we have workflow name substitution in KFP-TEKTON.
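
The general idea of giving each run its own job name can be sketched as follows. According to the comment above, the PR relies on workflow name substitution, which is not what this example does: here a random suffix stands in for the run-specific part. The `unique_job_name` helper, the base name, and the 63-character cap (the usual limit for Kubernetes DNS label names) are assumptions for illustration.

```python
import uuid


def unique_job_name(base: str, max_len: int = 63) -> str:
    """Append a short random suffix so repeated runs do not reuse a job name.

    Hypothetical helper: the base is truncated so the result fits in max_len.
    """
    suffix = uuid.uuid4().hex[:8]
    return f"{base[: max_len - len(suffix) - 1]}-{suffix}"


first = unique_job_name("trusted-ai-job")
second = unique_job_name("trusted-ai-job")
print(first)
print(second)
```

Because every call yields a different name, a second run does not collide with jobs left over from the first one, so no cleanup is needed between runs.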

pull request

Tomcli pull request kubeflow/kfp-tekton

Tomcli

Update trusted AI pipeline to have unique job names

Which issue is resolved by this Pull Request: Resolves #

Description of your changes: Update trusted AI pipeline to have unique job names so it can run multiple times without cleaning up the jobs.

Environment tested:

  • Python Version (use python --version):
  • Tekton Version (use tkn version):
  • Kubernetes Version (use kubectl version):
  • OS (e.g. from /etc/os-release):

Checklist:

created branch

Tomcli in kubeflow/kfp-tekton create branch Tomcli-patch-3

created 3 days ago