
Crash during Deployment of model on Vertex AI #154

@shinchri

In Chapter 9, in the "Deploy model to Vertex AI" section of flights_model_tf2.ipynb, the deployment crashes when I execute the first cell:

...
# upload model
gcloud beta ai models upload --region=$REGION --display-name=$MODEL_NAME \
     --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.${TF_VERSION}:latest \
     --artifact-uri=$EXPORT_PATH
MODEL_ID=$(gcloud ai models list --region=$REGION --format='value(MODEL_ID)' --filter=display_name=${MODEL_NAME})
echo "MODEL_ID=$MODEL_ID"

# deploy model to endpoint
gcloud ai endpoints deploy-model $ENDPOINT_ID \
  --region=$REGION \
  --model=$MODEL_ID \
  --display-name=$MODEL_NAME \
  --machine-type=n1-standard-2 \
  --min-replica-count=1 \
  --max-replica-count=1 \
  --traffic-split=0=100

When I check Vertex AI Endpoints, an endpoint was created, but the deployment step seems to have failed.

Output:
gs://tribbute-ml-central/ch9/trained_model/export/flights_20220803-222758/
Creating Endpoint for flights-20220803-223154
ENDPOINT_ID=974809417000157184
MODEL_ID=

followed by a very long error (too long to paste in full, so this is only part of it):

Using endpoint [https://us-central1-aiplatform.googleapis.com/]
WARNING: The following filter keys were not present in any resource : display_name
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Waiting for operation [7706081518493368320]...
.....done.
Created Vertex AI endpoint: projects/591020730428/locations/us-central1/endpoints/974809417000157184.
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
ERROR: gcloud crashed (InvalidDataFromServerError): Error decoding response "{
  "models": [
    {
      "name": "projects/591020730428/locations/us-central1/models/1316788319564070912",
      "displayName": "flights-20220803-223002",
      "predictSchemata": {},
      "containerSpec": {
        "imageUri": "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-9:latest"
      },
      "supportedDeploymentResourcesTypes": [
        "DEDICATED_RESOURCES"
      ],
      "supportedInputStorageFormats": [
        "jsonl",
        "bigquery",
        "csv",
        "tf-record",
        "tf-record-gzip",
        "file-list"
      ],
      "supportedOutputStorageFormats": [
        "jsonl",
        "bigquery"
      ],
      "createTime": "2022-08-03T22:30:12.377079Z",
      "updateTime": "2022-08-03T22:30:14.993220Z",
      "etag": "AMEw9yOIRZqfqqO_ngaA77Jw8Fs9E_kcI8tkqAIsTzFViX-aIrRbHfc0d2HRBihT32rp",
      "supportedExportFormats": [
        {
          "id": "custom-trained",
          "exportableContents": [
            "ARTIFACT"
          ]
        }
      ],
...


If you would like to report this issue, please run the following command:
  gcloud feedback

To check gcloud for common problems, please run the following command:
  gcloud info --run-diagnostics
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
ERROR: (gcloud.ai.endpoints.deploy-model) could not parse resource []
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
/tmp/ipykernel_1/3503756464.py in <module>
----> 1 get_ipython().run_cell_magic('bash', '', '# note TF_VERSION and ENDPOINT_NAME set in 1st cell\n# TF_VERSION=2-6\n# ENDPOINT_NAME=flights\n\nTIMESTAMP=$(date +%Y%m%d-%H%M%S)\nMODEL_NAME=${ENDPOINT_NAME}-${TIMESTAMP}\nEXPORT_PATH=$(gsutil ls ${OUTDIR}/export | tail -1)\necho $EXPORT_PATH\n\nif [[ $(gcloud ai endpoints list --region=$REGION \\\n        --format=\'value(DISPLAY_NAME)\' --filter=display_name=${ENDPOINT_NAME}) ]]; then\n    echo "Endpoint for $MODEL_NAME already exists"\nelse\n    # create model\n    echo "Creating Endpoint for $MODEL_NAME"\n    gcloud ai endpoints create --region=${REGION} --display-name=${ENDPOINT_NAME}\nfi\n\nENDPOINT_ID=$(gcloud ai endpoints list --region=$REGION \\\n              --format=\'value(ENDPOINT_ID)\' --filter=display_name=${ENDPOINT_NAME})\necho "ENDPOINT_ID=$ENDPOINT_ID"\n\n# delete any existing models with this name\nfor MODEL_ID in $(gcloud ai models list --region=$REGION --format=\'value(MODEL_ID)\' --filter=display_name=${MODEL_NAME}); do\n    echo "Deleting existing $MODEL_NAME ... $MODEL_ID "\n    gcloud ai models delete --region=$REGION $MODEL_ID\ndone\n\n# upload model\ngcloud beta ai models upload --region=$REGION --display-name=$MODEL_NAME \\\n     --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.${TF_VERSION}:latest \\\n     --artifact-uri=$EXPORT_PATH\nMODEL_ID=$(gcloud ai models list --region=$REGION --format=\'value(MODEL_ID)\' --filter=display_name=${MODEL_NAME})\necho "MODEL_ID=$MODEL_ID"\n\n# deploy model to endpoint\ngcloud ai endpoints deploy-model $ENDPOINT_ID \\\n  --region=$REGION \\\n  --model=$MODEL_ID \\\n  --display-name=$MODEL_NAME \\\n  --machine-type=n1-standard-2 \\\n  --min-replica-count=1 \\\n  --max-replica-count=1 \\\n  --traffic-split=0=100\n')

/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2470             with self.builtin_trap:
   2471                 args = (magic_arg_s, cell)
-> 2472                 result = fn(*args, **kwargs)
   2473             return result
   2474 

/opt/conda/lib/python3.7/site-packages/IPython/core/magics/script.py in named_script_magic(line, cell)
    140             else:
    141                 line = script
--> 142             return self.shebang(line, cell)
    143 
    144         # write a basic docstring:

/opt/conda/lib/python3.7/site-packages/decorator.py in fun(*args, **kw)
    230             if not kwsyntax:
    231                 args, kw = fix(args, kw, sig)
--> 232             return caller(func, *(extras + args), **kw)
    233     fun.__name__ = func.__name__
    234     fun.__doc__ = func.__doc__

/opt/conda/lib/python3.7/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188 
    189         if callable(arg):

/opt/conda/lib/python3.7/site-packages/IPython/core/magics/script.py in shebang(self, line, cell)
    243             sys.stderr.flush()
    244         if args.raise_error and p.returncode!=0:
--> 245             raise CalledProcessError(p.returncode, cell, output=out, stderr=err)
    246 
    247     def _run_script(self, p, cell, to_close):

CalledProcessError: Command 'b'# note TF_VERSION and ENDPOINT_NAME set in 1st cell\n# TF_VERSION=2-6\n# ENDPOINT_NAME=flights\n\nTIMESTAMP=$(date +%Y%m%d-%H%M%S)\nMODEL_NAME=${ENDPOINT_NAME}-${TIMESTAMP}\nEXPORT_PATH=$(gsutil ls ${OUTDIR}/export | tail -1)\necho $EXPORT_PATH\n\nif [[ $(gcloud ai endpoints list --region=$REGION \\\n        --format=\'value(DISPLAY_NAME)\' --filter=display_name=${ENDPOINT_NAME}) ]]; then\n    echo "Endpoint for $MODEL_NAME already exists"\nelse\n    # create model\n    echo "Creating Endpoint for $MODEL_NAME"\n    gcloud ai endpoints create --region=${REGION} --display-name=${ENDPOINT_NAME}\nfi\n\nENDPOINT_ID=$(gcloud ai endpoints list --region=$REGION \\\n              --format=\'value(ENDPOINT_ID)\' --filter=display_name=${ENDPOINT_NAME})\necho "ENDPOINT_ID=$ENDPOINT_ID"\n\n# delete any existing models with this name\nfor MODEL_ID in $(gcloud ai models list --region=$REGION --format=\'value(MODEL_ID)\' --filter=display_name=${MODEL_NAME}); do\n    echo "Deleting existing $MODEL_NAME ... $MODEL_ID "\n    gcloud ai models delete --region=$REGION $MODEL_ID\ndone\n\n# upload model\ngcloud beta ai models upload --region=$REGION --display-name=$MODEL_NAME \\\n     --container-image-uri=us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.${TF_VERSION}:latest \\\n     --artifact-uri=$EXPORT_PATH\nMODEL_ID=$(gcloud ai models list --region=$REGION --format=\'value(MODEL_ID)\' --filter=display_name=${MODEL_NAME})\necho "MODEL_ID=$MODEL_ID"\n\n# deploy model to endpoint\ngcloud ai endpoints deploy-model $ENDPOINT_ID \\\n  --region=$REGION \\\n  --model=$MODEL_ID \\\n  --display-name=$MODEL_NAME \\\n  --machine-type=n1-standard-2 \\\n  --min-replica-count=1 \\\n  --max-replica-count=1 \\\n  --traffic-split=0=100\n'' returned non-zero exit status 1.
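
The output above shows `MODEL_ID=` coming back empty, and the last gcloud error is `could not parse resource []`, which is consistent with `deploy-model` receiving an empty `--model` value. Below is a minimal sketch, assuming that is what happened, of a guard that stops the cell before the deploy step when a required value is empty. `require_value` is a hypothetical helper name, not something from the notebook, and the guard does not address the underlying `InvalidDataFromServerError` crash in `gcloud ai models list`; it only makes the failure easier to read.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the notebook): fail fast when a required
# value, such as MODEL_ID, came back empty from an earlier command.
require_value() {
  local name="$1" value="$2"
  if [[ -z "$value" ]]; then
    # Report on stderr and signal failure to the caller.
    echo "ERROR: $name is empty; skipping the steps that depend on it" >&2
    return 1
  fi
  # Echo the value in the same NAME=value style the cell already prints.
  echo "$name=$value"
}

# Intended use in the cell, between the upload and the deploy step
# (assumption: the cell's own variable names are unchanged):
#   require_value MODEL_ID "$MODEL_ID" || exit 1
#   gcloud ai endpoints deploy-model "$ENDPOINT_ID" --model="$MODEL_ID" ...
```

With this in place, the cell would exit right after the failed `models list` call instead of continuing on to `deploy-model` with an empty model ID.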
