[BUG]: Transpile CLI cannot read the configuration file it wrote to the workspace #1541

Open
1 task done
asnare opened this issue Apr 29, 2025 · 2 comments
Labels
bug Something isn't working

Comments

Contributor

asnare commented Apr 29, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Category of Bug / Issue

Other

Current Behavior

During routine installation of the pluggable transpiler components, a configuration file (~/.remorph/config.yml) is written to the user's workspace home directory. The file is loaded when transpiling, but the CLI crashes because the YAML cannot be deserialised.

Expected Behavior

The CLI should be able to read the configuration file that it wrote.
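
As a rough illustration of the expected round-trip property, here is a minimal sketch. `ConfigSketch`, `save` and `load` are hypothetical names, the fields are only a subset of those in the written file, and JSON from the standard library stands in for YAML to keep the example dependency-free. It is not the remorph or blueprint code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class ConfigSketch:
    # Hypothetical subset of the fields seen in the written config file.
    source_dialect: str
    input_source: str
    skip_validation: bool = True
    transpiler_options: dict[str, Any] = field(default_factory=dict)


def save(config: ConfigSketch) -> str:
    # Serialise the dataclass to text (JSON here, YAML in the real tool).
    return json.dumps(asdict(config), sort_keys=True)


def load(text: str) -> ConfigSketch:
    # Rebuild the dataclass from the serialised text.
    return ConfigSketch(**json.loads(text))


original = ConfigSketch(
    source_dialect="tsql",
    input_source="/tmp/foobar",
    transpiler_options={"-experimental": False},
)
restored = load(save(original))
assert restored == original  # whatever was written must load back unchanged
```

A regression test along these lines, run against the real serialiser, would have caught the failure described here.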

Steps To Reproduce

  1. Install remorph@main (4e262a932eaba59d486af8a91e111f1a7a6a0c11) using the Databricks CLI:

    databricks labs install remorph@main
  2. Install the transpiler plugins:

    databricks labs remorph install-transpile

    Accept default answers where possible. For questions without defaults I used:

    • Select the source dialect: tsql
    • Enter input SQL path (directory/file): /tmp/foobar
  3. Run the transpiler. Because of #1538 ([BUG]: Transpile invocation fails without most arguments being provided), the arguments must be passed explicitly, but the configuration is still loaded from the workspace:

    databricks labs remorph transpile --transpiler-config-path "${HOME}/.databricks/labs/remorph-transpilers/remorph-community-transpiler/lib/config.yml" --source-dialect tsql --input-source /tmp/foobar --output-folder transpiled --error-file-path errors.log --catalog-name remorph --schema-name transpiler --debug

The transpiler will fail to start due to an error:

databricks.labs.blueprint.installation.SerdeError: transpiler_options.-experimental: unknown: <class 'object'>: False

The configuration file that was written to the workspace (~/.remorph/config.yml) is this:

catalog_name: remorph
error_file_path: errors.log
input_source: /tmp/foobar
output_folder: transpiled
schema_name: transpiler
skip_validation: true
source_dialect: tsql
transpiler_config_path: /Users/**REDACTED**/.databricks/labs/remorph-transpilers/remorph-community-transpiler/lib/config.yml
transpiler_options:
  -experimental: false
version: 3
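
To make the failure mode concrete, here is a toy type-driven loader that produces the same message. It is a sketch, not blueprint's implementation: `unmarshal`, the `lenient` flag, and the `dict[str, object]` hint are assumptions inferred from the error text, and the local `SerdeError` is a stand-in for the class named in the traceback. The lenient branch is only one possible approach; the actual fix may differ.

```python
from typing import Any, get_args, get_origin


class SerdeError(TypeError):
    """Local stand-in for the error class named in the traceback."""


def unmarshal(inst: Any, path: list[str], type_ref: Any, lenient: bool = False) -> Any:
    """Toy loader that converts `inst` according to `type_ref`."""
    if get_origin(type_ref) is dict:
        # Recurse into each entry, extending the path with the key.
        value_type = get_args(type_ref)[1]
        return {k: unmarshal(v, [*path, k], value_type, lenient) for k, v in inst.items()}
    if type_ref in (bool, int, float, str) and isinstance(inst, type_ref):
        return inst
    if lenient and type_ref is object:
        # Hypothetical fix: accept any value when the hint is `object`.
        return inst
    # No rule matched the hint: report the dotted path, the hint and the value.
    raise SerdeError(f'{".".join(path)}: unknown: {type_ref}: {inst}')


raw = {"-experimental": False}
hint = dict[str, object]  # assumed hint, inferred from the error text

try:
    unmarshal(raw, ["transpiler_options"], hint)
    message = ""
except SerdeError as err:
    message = str(err)

print(message)
print(unmarshal(raw, ["transpiler_options"], hint, lenient=True))
```

The strict call fails on the `-experimental` entry because nothing in the loader knows how to handle a value hinted as `object`, even though the value itself (`False`) is a plain boolean.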

Relevant log output or Exception details

databricks labs remorph transpile --transpiler-config-path "${HOME}/.databricks/labs/remorph-transpilers/remorph-community-transpiler/lib/config.yml" --source-dialect tsql --input-source /tmp/foobar --output-folder transpiled --error-file-path errors.log --catalog-name remorph --schema-name transpiler --debug
15:53:35 Info: start pid=65187 version=0.249.0 args="databricks, labs, remorph, transpile, --transpiler-config-path, /Users/andrew.s
nare/.databricks/labs/remorph-transpilers/remorph-community-transpiler/lib/config.yml, --source-dialect, tsql, --input-source, /tmp/
foobar, --output-folder, transpiled, --error-file-path, errors.log, --catalog-name, remorph, --schema-name, transpiler, --debug"
15:53:35 Debug: Loading installed version info from: /Users/**REDACTED**/.databricks/labs/remorph/state/version.json pid=65187
[UPGRADE ADVISED] Newer remorph version was released 147 days ago. Please run `databricks labs upgrade remorph` to upgrade: main ->
v0.9.0
15:53:35 Debug: Loading login configuration from: /Users/**REDACTED**/.databricks/labs/remorph/config/login.json pid=65187
15:53:35 Debug: Using workspace-level login profile: default pid=65187
15:53:35 Debug: Loading default profile from /Users/**REDACTED**/.databrickscfg pid=65187 sdk=true
15:53:35 Debug: Resolved login: Config: host=https://adb-**REDACTED**.2.azuredatabricks.net, profile=default, config_file=/Users
/**REDACTED**/.databrickscfg pid=65187 sdk=true
15:53:35 Debug: Passing down environment variables: DATABRICKS_HOST, DATABRICKS_AUTH_TYPE pid=65187
15:53:35 Debug: Forwarding subprocess: /Users/**REDACTED**/.databricks/labs/remorph/state/venv/bin/python3 /Users/**REDACTED**/.data
bricks/labs/remorph/lib/src/databricks/labs/remorph/cli.py {"command":"transpile","flags":{"catalog-name":"remorph","error-file-path
":"errors.log","input-source":"/tmp/foobar","log_level":"debug","output-folder":"transpiled","schema-name":"transpiler","skip-valida
tion":"true","source-dialect":"tsql","transpiler-config-path":"/Users/**REDACTED**/.databricks/labs/remorph-transpilers/remorph-comm
unity-transpiler/lib/config.yml"},"output_type":""} pid=65187
15:53:35 Debug: starting: /Users/**REDACTED**/.databricks/labs/remorph/state/venv/bin/python3 /Users/**REDACTED**/.databricks/labs/r
emorph/lib/src/databricks/labs/remorph/cli.py {"command":"transpile","flags":{"catalog-name":"remorph","error-file-path":"errors.log
","input-source":"/tmp/foobar","log_level":"debug","output-folder":"transpiled","schema-name":"transpiler","skip-validation":"true",
"source-dialect":"tsql","transpiler-config-path":"/Users/**REDACTED**/.databricks/labs/remorph-transpilers/remorph-community-transpi
ler/lib/config.yml"},"output_type":""} pid=65187
15:53:36 DEBUG [databricks.sdk] Loaded from environment
15:53:36 DEBUG [databricks.sdk] Ignoring pat auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring basic auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring metadata-service auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring oauth-m2m auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring azure-client-secret auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring github-oidc-azure auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring azure-cli auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Ignoring external-browser auth, because databricks-cli is preferred
15:53:36 DEBUG [databricks.sdk] Attempting to configure auth: databricks-cli
15:53:36  INFO [databricks.sdk] Using Databricks CLI authentication
15:53:36 DEBUG [d.sdk.useragent] Adding cmd/execute-transpile to User-Agent
15:53:38 DEBUG [databricks.sdk] GET /api/2.0/preview/scim/v2/Me
< 200 OK
< {
<   "active": true,
<   "displayName": "**REDACTED**",
<   "emails": [
<     {
<       "primary": true,
<       "type": "work",
<       "value": "**REDACTED**"
<     }
<   ],
<   "externalId": "c8725ce9-35c9-40e2-982e-9bba117752de",
<   "groups": [
<     {
<       "$ref": "Groups/**REDACTED**",
<       "display": "**REDACTED**",
<       "type": "direct",
<       "value": "**REDACTED**"
<     },
<     "... (3 additional elements)"
<   ],
<   "id": "**REDACTED**",
<   "name": {
<     "givenName": "**REDACTED**"
<   },
<   "schemas": [
<     "urn:ietf:params:scim:schemas:core:2.0:User",
<     "... (1 additional elements)"
<   ],
<   "userName": "**REDACTED**@**REDACTED**"
< }
15:53:39 DEBUG [databricks.sdk] GET /api/2.0/preview/scim/v2/Me
< 200 OK
< {
<   "active": true,
<   "displayName": "**REDACTED**",
<   "emails": [
<     {
<       "primary": true,
<       "type": "work",
<       "value": "**REDACTED**"
<     }
<   ],
<   "externalId": "**REDACTED**",
<   "groups": [
<     {
<       "$ref": "Groups/**REDACTED**",
<       "display": "**REDACTED**",
<       "type": "direct",
<       "value": "**REDACTED**"
<     },
<     "... (3 additional elements)"
<   ],
<   "id": "**REDACTED**",
<   "name": {
<     "givenName": "**REDACTED**"
<   },
<   "schemas": [
<     "urn:ietf:params:scim:schemas:core:2.0:User",
<     "... (1 additional elements)"
<   ],
<   "userName": "**REDACTED**@**REDACTED**"
< }
15:53:39 DEBUG [d.l.blueprint.installation] Loading TranspileConfig from config.yml
15:53:39 DEBUG [databricks.sdk] GET /api/2.0/workspace/export?path=/Users/**REDACTED**@databricks.com/.remorph/config.yml&direct_dow
nload=true
< 200 OK
< [raw stream]
15:53:39 ERROR [d.l.remorph.transpile] Failed to call transpile: Traceback (most recent call last):
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/cli.py", line
 113, in _route
    cmd.fn(**kwargs)
  File "/Users/**REDACTED**/.databricks/labs/remorph/lib/src/databricks/labs/remorph/cli.py", line 104, in transpile
    default_config = ctx.transpile_config
  File "/usr/local/Cellar/python@3.10/3.10.17/Frameworks/Python.framework/Versions/3.10/lib/python3.10/functools.py", line 981, in _
_get__
    val = self.func(instance)
  File "/Users/**REDACTED**/.databricks/labs/remorph/lib/src/databricks/labs/remorph/contexts/application.py", line 54, in transpile
_config
    return self.installation.load(TranspileConfig)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 210, in load
    return self._unmarshal_type(as_dict, filename, type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 378, in _unmarshal_type
    return cls._unmarshal(as_dict, [], type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 620, in _unmarshal
    return cls._unmarshal_dataclass(inst, path, type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 666, in _unmarshal_dataclass
    value = cls._unmarshal(raw, [*path, field_name], hint)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 635, in _unmarshal
    return cls._unmarshal_generic_types(type_ref, path, inst)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 646, in _unmarshal_generic_types
    return cls._unmarshal_union(inst, path, type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 685, in _unmarshal_union
    value = cls._unmarshal(inst, path, variant)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 635, in _unmarshal
    return cls._unmarshal_generic_types(type_ref, path, inst)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 648, in _unmarshal_generic_types
    return cls._unmarshal_generic(inst, path, type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 705, in _unmarshal_generic
    return cls._unmarshal_dict(inst, path, type_args[1])
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 729, in _unmarshal_dict
    from_dict[k] = cls._unmarshal(v, [*path, k], type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 635, in _unmarshal
    return cls._unmarshal_generic_types(type_ref, path, inst)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 646, in _unmarshal_generic_types
    return cls._unmarshal_union(inst, path, type_ref)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 685, in _unmarshal_union
    value = cls._unmarshal(inst, path, variant)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 635, in _unmarshal
    return cls._unmarshal_generic_types(type_ref, path, inst)
  File "/Users/**REDACTED**/.databricks/labs/remorph/state/venv/lib/python3.10/site-packages/databricks/labs/blueprint/installation.
py", line 649, in _unmarshal_generic_types
    raise SerdeError(f'{".".join(path)}: unknown: {type_ref}: {inst}')
databricks.labs.blueprint.installation.SerdeError: transpiler_options.-experimental: unknown: <class 'object'>: False
Error: unexpected end of JSON input
15:53:39 Info: failed execution pid=65187 exit_code=1 error="unexpected end of JSON input"
15:53:39 Debug: no telemetry logs to upload pid=65187

Operating System

macOS

Version

latest via Databricks CLI

asnare added the bug label on Apr 29, 2025
Contributor

ericvergnaud commented Apr 29, 2025

I have also bumped into this bug. I suspect it is already fixed by #1488 and the underlying fixes in databrickslabs/blueprint#189. @gueniai, can you validate databrickslabs/blueprint#189 so that #1488 becomes mergeable?

Collaborator

gueniai commented May 1, 2025

Alternatively, I don't believe we need the config to be written to the workspace. Can we simply remove that step?
