Unsuccessful diffs forgotten after conflicts are resolved #152

Open
dracic opened this issue Apr 26, 2025 · 0 comments

Issue: Unapplied changes lost after conflict and subsequent successful sync

Description:

When a db-sync pull operation results in conflicts, and these conflicts are not resolved before a subsequent successful sync occurs, the unapplied changes from the initial conflicted sync appear to be lost or forgotten by db-sync.

Steps to Reproduce:

  1. Perform a db-sync pull operation where changes in the Mergin Maps project conflict with changes in the local database's 'modified' schema. Observe that conflicts are reported.
  2. Retry the db-sync operation. Conflicts are still reported, indicating the previous conflicts were not fully resolved or applied.
  3. Introduce new, non-conflicting changes to the Mergin Maps project.
  4. Perform another db-sync pull operation. This sync completes successfully, and the database is updated with the new changes from step 3.
  5. Observe that the unapplied changes from step 1 (the initial conflicted sync) are no longer present in the 'modified' schema and are not considered pending changes by db-sync.

Expected Behavior:

db-sync should retain and continue to track unapplied changes resulting from conflicts, prompting the user or providing a mechanism to resolve and apply them, even after subsequent successful sync operations.
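
One way to picture the "retain and continue to track" behavior is a small persistent ledger that records each conflicted pull until someone marks it resolved. The sketch below is only an illustration of that idea: the `ConflictLedger` class, its JSON layout, and keying entries by a version string are all assumptions made for this example and are not part of db-sync.

```python
import json
import os
import tempfile


class ConflictLedger:
    """Hypothetical on-disk record of pulls that ended with conflicts."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        # A missing ledger file simply means nothing is pending.
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def _save(self, entries):
        # Write to a temporary file first, then swap it in, so an
        # interrupted write does not leave a truncated ledger behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(entries, f, indent=2)
        os.replace(tmp_path, self.path)

    def record(self, version, conflict_file):
        """Remember that pulling `version` left conflicts in `conflict_file`."""
        entries = self._load()
        entries[version] = {"conflict_file": conflict_file}
        self._save(entries)

    def resolve(self, version):
        """Forget a conflicted pull once it has been dealt with."""
        entries = self._load()
        entries.pop(version, None)
        self._save(entries)

    def pending(self):
        """Return the versions that still have unresolved conflicts."""
        return sorted(self._load())
```

With something like this in place, a later successful sync would not erase the record of an earlier conflicted one; the entry would stay in `pending()` until `resolve()` is called for it.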

Observed Behavior:

Unapplied changes from a conflicted sync are lost after a subsequent successful sync that updates the 'base' schema.

Possible Cause:

Based on the analysis of the db-sync code (dbsync.py) and the behavior of geodiff, the issue likely stems from how conflicts are handled during the rebase process and the subsequent state management.

The pull function in dbsync.py uses _geodiff_rebase when local database changes exist (tmp_base2our is not empty) and there are incoming changes from Mergin Maps (tmp_base2their).

Relevant code snippet from dbsync.py:

    if not needs_rebase:
        logging.debug("Applying new version [no rebase]")
        _geodiff_apply_changeset(conn_cfg.driver, conn_cfg.conn_info, conn_cfg.base, tmp_base2their, ignored_tables)
        _geodiff_apply_changeset(conn_cfg.driver, conn_cfg.conn_info, conn_cfg.modified, tmp_base2their, ignored_tables)
    else:
        logging.debug("Applying new version [WITH rebase]")
        tmp_conflicts = os.path.join(tmp_dir, f"{project_name}-dbsync-pull-conflicts")
        _geodiff_rebase(
            conn_cfg.driver,
            conn_cfg.conn_info,
            conn_cfg.base,
            conn_cfg.modified,
            tmp_base2their,
            tmp_conflicts,
            ignored_tables,
        )
        _geodiff_apply_changeset(conn_cfg.driver, conn_cfg.conn_info, conn_cfg.base, tmp_base2their, ignored_tables)

The _geodiff_rebase function generates a conflict file (tmp_conflicts), but the current db-sync logic does not appear to explicitly read or process this file to re-attempt applying the conflicted changes.
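
As a minimal sketch of what "processing" the file could start with, the conflict file could at least be copied out of the temporary directory before it is discarded. The helper name `preserve_conflict_file` and the `archive_dir` parameter are hypothetical, and the sketch assumes that a missing or empty file means no conflicts were reported; whether geodiff actually behaves that way would need to be checked.

```python
import os
import shutil
import time


def preserve_conflict_file(tmp_conflicts, archive_dir):
    """Copy a non-empty conflict file into archive_dir.

    Returns the path of the copy, or None when there is nothing to keep.
    Assumption: a missing or empty file means no conflicts were reported.
    """
    if not os.path.isfile(tmp_conflicts) or os.path.getsize(tmp_conflicts) == 0:
        return None
    os.makedirs(archive_dir, exist_ok=True)
    # Suffix with a nanosecond timestamp so repeated pulls do not
    # overwrite earlier conflict files.
    name = f"{os.path.basename(tmp_conflicts)}-{time.time_ns()}"
    target = os.path.join(archive_dir, name)
    shutil.copy2(tmp_conflicts, target)
    return target
```

In the snippet above, such a helper would be called right after `_geodiff_rebase(...)` returns and before the 'base' schema is updated, so the record of the conflict outlives the temporary directory.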

When a conflict occurs during rebase-db in the initial conflicted sync (Sync 1, step 1 above), some changes might not be applied to the 'modified' schema. Subsequent db-sync retries might hit the same conflicts if the state hasn't changed. However, if a later sync (Sync 2, step 4 above) brings changes that do not conflict with the current state of the 'modified' schema, rebase-db might succeed for these new changes. Crucially, the _geodiff_apply_changeset call after the rebase applies the tmp_base2their changeset (representing the Mergin Maps changes relative to the original base before the rebase) to the 'base' schema. This updates the 'base' schema to the state after Sync 2.

The unapplied changes from Sync 1, if not successfully integrated into the 'modified' schema during the initial conflicted rebase, are now compared against a new 'base' schema. This change in the reference point can cause db-sync to no longer correctly identify these unapplied changes as pending, effectively losing them from the synchronization process.

Why is this important? We will never be able to map GeoPackage to PostgreSQL types perfectly, and these errors will appear sometimes, especially when we use PostgreSQL as the init source, so it is important that we don't lose conflicted changesets. Sometimes these issues can be resolved with minimal tweaking on the PostgreSQL side without another initial sync.

@ValentinBuira ValentinBuira self-assigned this May 5, 2025