Overview#

This section aims to provide a detailed explanation of possible use cases.

All examples provided here have a corresponding test case in the example/demo/tests repo directory, where you can tinker with them.

The following examples use these models:

from django.db import models


class Company(models.Model):
    name = models.CharField(max_length=100)
    address = models.CharField(max_length=100)


class Project(models.Model):
    name = models.CharField(max_length=100)
    company = models.ForeignKey(
        Company, related_name="projects", on_delete=models.CASCADE
    )


class Employee(models.Model):
    name = models.CharField(max_length=100)
    company = models.ForeignKey(
        Company, related_name="employees", on_delete=models.CASCADE
    )


class Counterpart(models.Model):
    name = models.CharField(max_length=100)
    external_id = models.IntegerField()
    project = models.ForeignKey(
        Project, related_name="counterparts", on_delete=models.CASCADE
    )


class Task(models.Model):
    name = models.CharField(max_length=100)
    description = models.TextField()

    assignee = models.ForeignKey(
        Employee, related_name="tasks", on_delete=models.CASCADE
    )
    project = models.ForeignKey(Project, related_name="tasks", on_delete=models.CASCADE)
    counterparts = models.ManyToManyField(Counterpart, related_name="tasks")

Copy with taking data from original model#

This is the most common use case. You have a model instance and you want to create a copy of it, keeping the original field values. You can do this by using the TAKE_FROM_ORIGIN shortcut.

company = Company.objects.create(name="Company", address="Address")

config = ModelCopyConfig(
    model=Company,
    filter_field_to_input_key={"id": "company_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "address": TAKE_FROM_ORIGIN,
    },
)
copy_request = CopyRequest(
    config=CopyistConfig([config]),
    input_data={"company_id": company.id},
    confirm_write=False,
)
result = Copyist(copy_request).execute_copy_request()
assert result.is_copy_successful
assert len(result.output_map["Company"]) == 1
new_company_id = result.output_map["Company"][str(company.id)]
new_company = Company.objects.get(id=new_company_id)
assert new_company.name == company.name

With this configuration, the name and address fields of the new company will be copied from the original company.

Note

TAKE_FROM_ORIGIN is a shortcut for FieldCopyConfig. You can also use the FieldCopyConfig class directly.

Copy with taking data from external source#

Sometimes you want to copy a model but set a field's value from some other source. You can do this by using the TAKE_FROM_INPUT action.

company = Company.objects.create(name="Company", address="Address")

config = ModelCopyConfig(
    model=Company,
    filter_field_to_input_key={"id": "company_id"},
    field_copy_actions={
        "name": FieldCopyConfig(action=CopyActions.TAKE_FROM_INPUT, input_key="new_name"),
        "address": TAKE_FROM_ORIGIN,
    },
)
copy_request = CopyRequest(
    config=CopyistConfig([config]),
    input_data={"company_id": company.id, "new_name": "New Company"},
    confirm_write=False,
)
result = Copyist(copy_request).execute_copy_request()
new_company_id = result.output_map["Company"][str(company.id)]
new_company = Company.objects.get(id=new_company_id)
assert new_company.name == "New Company"

This is useful when the model you copy has unique constraints: you can use TAKE_FROM_INPUT to override the values of the unique fields.
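The difference between the two actions can be pictured with a small pure-Python sketch. It is not how django-copyist is implemented: the `resolve_fields` helper and the `ORIGIN` / `("input", key)` markers are hypothetical stand-ins for TAKE_FROM_ORIGIN and TAKE_FROM_INPUT, and plain dictionaries stand in for model instances.

```python
from typing import Any, Dict, Tuple, Union

# Hypothetical markers standing in for TAKE_FROM_ORIGIN and TAKE_FROM_INPUT.
ORIGIN = "origin"
FieldAction = Union[str, Tuple[str, str]]


def resolve_fields(
    origin: Dict[str, Any],
    actions: Dict[str, FieldAction],
    input_data: Dict[str, Any],
) -> Dict[str, Any]:
    """Build the field values of the copy from the origin row and the input."""
    copied = {}
    for field, action in actions.items():
        if action == ORIGIN:
            # The value is carried over unchanged from the original row.
            copied[field] = origin[field]
        else:
            # ("input", key): the value is looked up in the input data.
            _, input_key = action
            copied[field] = input_data[input_key]
    return copied


original = {"id": 1, "name": "Company", "address": "Address"}
new_row = resolve_fields(
    original,
    {"name": ("input", "new_name"), "address": ORIGIN},
    {"company_id": 1, "new_name": "New Company"},
)
print(new_row)
```

Only the fields listed in the actions end up in the copy, which mirrors how the configs above name every field they want to carry over.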

Handling denormalized data#

Sometimes life (or business) forces you to have your data denormalized. That can make it tricky to copy hierarchical data, because references to top-level models may be reused in some of the lower-level models.

You can handle this case with Copyist and the UpdateToCopied() action.

company = Company.objects.create(name="Company", address="Address")
project = Project.objects.create(name="Project", company=company)
counterpart = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart2", external_id=2, project=project
)

employee = Employee.objects.create(name="Employee", company=company)
task = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project,
)
task.counterparts.add(counterpart, counterpart2)

config = ModelCopyConfig(
    model=Company,
    filter_field_to_input_key={"id": "company_id"},
    field_copy_actions={
        "name": FieldCopyConfig(
            action=CopyActions.TAKE_FROM_INPUT,
            input_key="new_company_name",
        ),
        "address": FieldCopyConfig(
            action=CopyActions.TAKE_FROM_INPUT,
            input_key="new_company_address",
        ),
        "projects": MakeCopy(
            ModelCopyConfig(
                model=Project,
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "counterparts": MakeCopy(
                        ModelCopyConfig(
                            model=Counterpart,
                            field_copy_actions={
                                "name": TAKE_FROM_ORIGIN,
                                "external_id": TAKE_FROM_ORIGIN,
                            },
                        )
                    ),
                },
            )
        ),
        "employees": MakeCopy(
            ModelCopyConfig(
                model=Employee,
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                },
            )
        ),
    },
    compound_copy_actions=[
        ModelCopyConfig(
            model=Task,
            field_copy_actions={
                "name": TAKE_FROM_ORIGIN,
                "description": TAKE_FROM_ORIGIN,
                "counterparts": UpdateToCopied(Counterpart),
                "project": UpdateToCopied(Project),
                "assignee": UpdateToCopied(Employee),
            },
        )
    ],
)

new_company_name = "New Company"
new_company_address = "New Address"
copy_request = CopyRequest(
    config=CopyistConfig([config]),
    input_data={
        "company_id": company.id,
        "new_company_name": new_company_name,
        "new_company_address": new_company_address,
    },
    confirm_write=False,
)
result = Copyist(copy_request).execute_copy_request()

Here we use UpdateToCopied() action to update the reference to the previously copied models. The way it works is that Copyist, as it copies data, stores the mapping of the original model id to the new model id. Then, when it encounters the UpdateToCopied() action, it uses this mapping to update the reference to the copied model.
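The mapping idea described above can be sketched in plain Python. This is only an illustration under simplifying assumptions: dictionaries stand in for rows, `copy_rows` and the fake id sequence are hypothetical helpers, and the real library keeps its own internal bookkeeping.

```python
from itertools import count
from typing import Any, Dict, List

Row = Dict[str, Any]
new_ids = count(1000)  # fake primary-key sequence for the copies


def copy_rows(rows: List[Row], id_map: Dict[int, int]) -> List[Row]:
    """Copy rows and remember which original id became which new id."""
    copies = []
    for row in rows:
        copy = {**row, "id": next(new_ids)}
        id_map[row["id"]] = copy["id"]
        copies.append(copy)
    return copies


employees = [{"id": 10, "name": "Employee"}]
tasks = [{"id": 100, "name": "Task", "assignee_id": 10}]

# Parents are copied first, so their id mapping is available afterwards.
employee_id_map: Dict[int, int] = {}
new_employees = copy_rows(employees, employee_id_map)

task_id_map: Dict[int, int] = {}
new_tasks = copy_rows(tasks, task_id_map)
for task in new_tasks:
    # This step plays the role of UpdateToCopied(Employee):
    # repoint the reference from the original employee to its copy.
    task["assignee_id"] = employee_id_map[task["assignee_id"]]

print(employee_id_map)
print(new_tasks)
```

The sketch also shows why ordering matters: the reference can only be rewritten once the referenced model has been copied and its new id is known.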

Note

UpdateToCopied is a shortcut for FieldCopyConfig. You can also use the FieldCopyConfig class directly.

It is not limited to compound actions; you can use it in ModelCopyConfig.field_copy_actions as well.

Copying data with multiple parent models#

If we take a closer look at the previous example, we will see that ModelCopyConfig.compound_copy_actions is used.

This attribute stores a list of ModelCopyConfig objects, which are executed after all ModelCopyConfig.field_copy_actions have been executed.

This way you can first copy all parent models, and then use compound actions to create a model that relies on multiple parent models.
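The ordering can be illustrated with a toy traversal. The sketch below is an assumption-laden model, not the library's scheduler: configs are plain dictionaries, and the `field_children` and `compound` keys are hypothetical names for nested MakeCopy configs and compound_copy_actions respectively. The only property it is meant to show is that compound configs run after the nested field copies of the same parent.

```python
from typing import Any, Dict, List

Config = Dict[str, Any]


def execution_order(config: Config) -> List[str]:
    """Return model names in the order this sketch would copy them."""
    order = [config["model"]]
    # Nested copies declared through field actions run first ...
    for child in config.get("field_children", []):
        order.extend(execution_order(child))
    # ... and compound actions run only after all of them are done.
    for compound in config.get("compound", []):
        order.extend(execution_order(compound))
    return order


company_config = {
    "model": "Company",
    "field_children": [
        {"model": "Project", "field_children": [{"model": "Counterpart"}]},
        {"model": "Employee"},
    ],
    "compound": [{"model": "Task"}],
}
print(execution_order(company_config))
```

Task comes last, so by the time it is copied the copies of Project, Counterpart and Employee already exist and can be referenced.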

Closer look at CopyRequest and CopyResult#

You probably noticed the CopyRequest and CopyResult classes that are used in the examples above. Let’s take a closer look at them.

copy_request = CopyRequest(
    config=CopyistConfig([config]),
    input_data={"company_id": company.id, "new_name": "New Company"},
    confirm_write=False,
)

In this example, we create a CopyRequest object.

CopyistConfig is a class that holds the configuration for the copy process. It takes a list of ModelCopyConfig objects. It is the root config and can hold multiple ModelCopyConfig objects if you need to copy several root-level models in one request.

input_data is a dictionary that holds the input data for the copy process. Use it to pass values that are not present in the original model, such as the ids to filter by or replacement field values.

confirm_write needs a bit more explanation. It is a boolean that tells the copy process whether it should write the data even if unmatched or ignored values were discovered during the copy process.

What are unmatched or ignored values? Let’s take a look at the CopyResult object.

CopyResult is an object that holds the result of the copy process.

The first attribute to check is is_copy_successful, a boolean that tells you whether the copy process was successful. If it is False, look at the reason attribute, an enum that tells you why the copy process failed.

output_map is a dictionary that holds the mapping of the original model id to the new model id. It can be stored for historical purposes or to be used for UI rendering. This field is populated only on successful copy.
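As a small illustration, a lookup helper over a dictionary shaped like output_map might look as follows. The shape (model name, then stringified original id, then new id) is inferred from the examples on this page, and `copied_id` is a hypothetical helper, not part of the library.

```python
from typing import Any, Dict, Optional

# Shape assumed from the examples: model name -> {str(original id): new id}.
OutputMap = Dict[str, Dict[str, Any]]


def copied_id(output_map: OutputMap, model_name: str, original_id: int) -> Optional[Any]:
    """Return the id of the copy, or None if that object was not copied."""
    return output_map.get(model_name, {}).get(str(original_id))


output_map = {"Company": {"1": 7}, "Project": {"3": 12, "4": 13}}
print(copied_id(output_map, "Project", 4))
print(copied_id(output_map, "Task", 1))
```

Note that the original ids are looked up as strings, which matches the `str(company.id)` keys used in the examples.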

If your copy is unsuccessful, look at the set_to_filter_map and ignored_map attributes. set_to_filter_map is a dictionary that maps original model ids to the ids matched by a SET_TO_FILTER action. ignored_map is a dictionary that holds the models that were ignored because of an ignore_condition.

Why would you use these attributes? Let’s look at the following examples.

Setting attribute to filtered value#

Sometimes the model you copy is not a top-level model but lives inside an existing hierarchy, and you need to point some attribute at a model that already exists in the target context. You can do this by using the SET_TO_FILTER action.

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart1 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project1
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project2
)
employee = Employee.objects.create(name="Employee", company=company)
task = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project1,
)
task.counterparts.add(counterpart1)

config = ModelCopyConfig(
    model=Employee,
    filter_field_to_input_key={"id": "employee_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "company": TAKE_FROM_ORIGIN,
        "tasks": MakeCopy(
            ModelCopyConfig(
                model=Task,
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "description": TAKE_FROM_ORIGIN,
                    "project_id": FieldCopyConfig(
                        action=CopyActions.TAKE_FROM_INPUT,
                        input_key="new_project_id",
                    ),
                    "counterparts": FieldCopyConfig(
                        action=CopyActions.SET_TO_FILTER,
                        reference_to=Counterpart,
                        filter_config=FilterConfig(
                            filters={
                                "project_id": FieldFilterConfig(
                                    source=FilterSource.FROM_INPUT,
                                    key="new_project_id",
                                ),
                                "external_id": FieldFilterConfig(
                                    source=FilterSource.FROM_ORIGIN
                                ),
                            }
                        ),
                    ),
                },
            )
        ),
    },
)

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert result.is_copy_successful
assert len(result.output_map["Employee"]) == 1
assert len(result.output_map["Task"]) == 1

new_task_id = result.output_map["Task"][str(task.id)]
new_task = Task.objects.get(id=new_task_id)
assert new_task.project_id == project2.id
assert new_task.counterparts.count() == 1
new_counterpart = new_task.counterparts.first()
assert new_counterpart.project_id == project2.id
assert new_counterpart.external_id == counterpart2.external_id
assert new_counterpart.id != counterpart1.id
assert new_counterpart.id == counterpart2.id

In this example, we copy an employee with all of their tasks from one project to another. Counterpart models are linked to a Project, so when we copy a Task we want the copy to reference the Counterparts in the new project that correspond to those of the original task. Here, correspondence is defined by the external_id field, so we use the SET_TO_FILTER action to set the matching Counterparts on the copied Task.
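The matching that the two filters express can be mimicked on plain data. The following is a simplified sketch, not the query the library runs: `match_by_external_id` is a hypothetical helper, dictionaries stand in for Counterpart rows, and the result uses the "original id to matched id or None" shape that the per-field entries of set_to_filter_map have in the examples on this page.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def match_by_external_id(
    original: List[Row], candidates: List[Row], new_project_id: int
) -> Dict[str, Optional[str]]:
    """Map each original counterpart id to the id of its match, or None."""
    # Equivalent of the project_id filter taken from the input data.
    in_target = {
        cp["external_id"]: cp["id"]
        for cp in candidates
        if cp["project_id"] == new_project_id
    }
    # Equivalent of the external_id filter taken from the original row.
    return {
        str(cp["id"]): (
            str(in_target[cp["external_id"]])
            if cp["external_id"] in in_target
            else None
        )
        for cp in original
    }


counterparts = [
    {"id": 1, "external_id": 1, "project_id": 1},
    {"id": 2, "external_id": 2, "project_id": 1},
    {"id": 3, "external_id": 1, "project_id": 2},
]
originals = [cp for cp in counterparts if cp["project_id"] == 1]
mapping = match_by_external_id(originals, counterparts, new_project_id=2)
print(mapping)
```

A `None` value marks an original counterpart that has no equivalent in the target project.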

Set to filter matching gone wrong#

The example above works well, but what if the destination project doesn’t have a corresponding Counterpart?

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart1 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project1
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart 2", external_id=2, project=project1
)
counterpart3 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project2
)
employee = Employee.objects.create(name="Employee", company=company)
task = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project1,
)
task.counterparts.add(counterpart1, counterpart2)

config = ModelCopyConfig(
    model=Employee,
    filter_field_to_input_key={"id": "employee_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "company": TAKE_FROM_ORIGIN,
        "tasks": MakeCopy(
            ModelCopyConfig(
                model=Task,
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "description": TAKE_FROM_ORIGIN,
                    "project_id": FieldCopyConfig(
                        action=CopyActions.TAKE_FROM_INPUT,
                        input_key="new_project_id",
                    ),
                    "counterparts": FieldCopyConfig(
                        action=CopyActions.SET_TO_FILTER,
                        reference_to=Counterpart,
                        filter_config=FilterConfig(
                            filters={
                                "project_id": FieldFilterConfig(
                                    source=FilterSource.FROM_INPUT,
                                    key="new_project_id",
                                ),
                                "external_id": FieldFilterConfig(
                                    source=FilterSource.FROM_ORIGIN
                                ),
                            }
                        ),
                    ),
                },
            )
        ),
    },
)

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert not result.is_copy_successful
assert result.reason == AbortReason.NOT_MATCHED
assert result.set_to_filter_map[Task.__name__]["counterparts"] == {
    str(counterpart1.id): str(counterpart3.id),
    str(counterpart2.id): None,
}

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=True,
        set_to_filter_map=result.set_to_filter_map,
        ignored_map=result.ignored_map,
    )
).execute_copy_request()

assert result.is_copy_successful

new_task_id = result.output_map["Task"][str(task.id)]
new_task = Task.objects.get(id=new_task_id)
assert new_task.counterparts.count() == 1

Here we use the same config as in the previous example, but now the Counterpart with external_id 2 exists only in Project1 and not in Project2. This is where CopyResult comes into play: we can see that the copy process failed because the counterpart with external_id 2 was not found in the destination project.

By observing the CopyResult.set_to_filter_map attribute, we can see that the counterpart with external_id 2 was not matched.

If this happens in an interactive context, you can prompt the user to resolve the issue or to accept that some data won’t be copied.
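To build such a prompt you need the list of unmatched values. A minimal sketch, assuming the nested shape seen in the assertions above (model name, then field name, then original id mapped to the matched id or None); `unmatched_entries` is a hypothetical helper and operates on a plain dictionary:

```python
from typing import Dict, List, Optional, Tuple

# Assumed shape: model name -> field name -> {original id: matched id or None}.
NestedMatchMap = Dict[str, Dict[str, Dict[str, Optional[str]]]]


def unmatched_entries(match_map: NestedMatchMap) -> List[Tuple[str, str, str]]:
    """List (model name, field name, original id) for every unmatched value."""
    return [
        (model_name, field_name, original_id)
        for model_name, fields in match_map.items()
        for field_name, matches in fields.items()
        for original_id, matched_id in matches.items()
        if matched_id is None
    ]


match_map = {"Task": {"counterparts": {"1": "3", "2": None}}}
for model_name, field_name, original_id in unmatched_entries(match_map):
    print(f"{model_name}.{field_name}: no match for original id {original_id}")
```

The resulting tuples can be rendered in a confirmation dialog before the request is repeated.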

If we want the copy process to continue regardless, we can set confirm_write to True and pass the CopyResult.set_to_filter_map attribute (together with ignored_map, as in the code above) to the CopyRequest object.

Note

CopyRequest.set_to_filter_map is passed so that Copyist can verify that the list of unmatched values has stayed the same between copy calls. If it has changed, an unsuccessful result with reason DATA_CHANGED_STF is returned.
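The two-step confirmation can be modelled roughly as below. This is a simplified sketch of the behaviour described on this page for a single matched field, not the library's code: `decide` is a hypothetical function, the reason names are plain strings borrowed from the text, and cases the text does not describe are not covered.

```python
from typing import Dict, Optional

MatchMap = Dict[str, Optional[str]]  # original id -> matched id or None


def decide(
    current: MatchMap, confirmed: Optional[MatchMap], confirm_write: bool
) -> str:
    """Return "WRITE" or the name of the abort reason for one matched field."""
    has_unmatched = any(value is None for value in current.values())
    if not has_unmatched:
        return "WRITE"
    if not confirm_write:
        # First pass: report the mismatches instead of writing.
        return "NOT_MATCHED"
    if confirmed != current:
        # The data changed between the two calls, so the confirmation is stale.
        return "DATA_CHANGED_STF"
    return "WRITE"


first_pass = {"1": "3", "2": None}
print(decide(first_pass, None, confirm_write=False))
print(decide(first_pass, first_pass, confirm_write=True))
print(decide({"1": None, "2": None}, first_pass, confirm_write=True))
```

The point of the comparison is that the user confirmed one specific set of mismatches; if the data has changed since, that confirmation no longer applies.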

Set to filter using custom function#

Sometimes you need to set a field to a filtered value, but finding the matching value requires custom logic and a few filters aren’t enough.

In these cases the filter_func of FilterConfig comes in handy. Let’s see an example:

from typing import Any, Dict, List

from django.db.models import Model

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart1 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project1
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project2
)
employee = Employee.objects.create(name="Employee", company=company)
task = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project1,
)
task.counterparts.add(counterpart1)

def match_counterparts(
    model_config: "ModelCopyConfig",
    input_data: Dict[str, Any],
    field_name: str,
    field_copy_config: "FieldCopyConfig",
    set_to_filter_map: "SetToFilterMap",
    instance_list: List[Model],
    referenced_instance_list: List[Model],
) -> "FieldSetToFilterMap":
    original_counterparts = Counterpart.objects.filter(
        tasks__id__in=[task.id for task in instance_list],
    )
    new_counterparts = Counterpart.objects.filter(
        project_id=input_data["new_project_id"],
        external_id__in=[cp.external_id for cp in original_counterparts],
    )
    external_id_to_new_counterpart = {cp.external_id: cp for cp in new_counterparts}
    return {
        str(cp.id): (
            str(external_id_to_new_counterpart[cp.external_id].id)
            if cp.external_id in external_id_to_new_counterpart
            else None
        )
        for cp in original_counterparts
    }

config = ModelCopyConfig(
    model=Employee,
    filter_field_to_input_key={"id": "employee_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "company": TAKE_FROM_ORIGIN,
        "tasks": MakeCopy(
            ModelCopyConfig(
                model=Task,
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "description": TAKE_FROM_ORIGIN,
                    "project_id": FieldCopyConfig(
                        action=CopyActions.TAKE_FROM_INPUT,
                        input_key="new_project_id",
                    ),
                    "counterparts": FieldCopyConfig(
                        action=CopyActions.SET_TO_FILTER,
                        reference_to=Counterpart,
                        filter_config=FilterConfig(
                            filter_func=match_counterparts,
                        ),
                    ),
                },
            )
        ),
    },
)

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert result.is_copy_successful
assert len(result.output_map["Employee"]) == 1
assert len(result.output_map["Task"]) == 1

new_task_id = result.output_map["Task"][str(task.id)]
new_task = Task.objects.get(id=new_task_id)
assert new_task.project_id == project2.id
assert new_task.counterparts.count() == 1
new_counterpart = new_task.counterparts.first()
assert new_counterpart.project_id == project2.id
assert new_counterpart.external_id == counterpart2.external_id
assert new_counterpart.id != counterpart1.id
assert new_counterpart.id == counterpart2.id

You can see that we defined the function match_counterparts and used it for filtering. Although the signature of the function is quite complex, it provides basically all the available context, allowing you to write whatever logic you need to match the values. You can read more about the signature at SetToFilterFunc.

Ignoring models during copy with SET_TO_FILTER#

You probably noticed the CopyResult.ignored_map attribute in the previous examples. So how exactly is it used?

For example, let’s assume you want the same config as in the SET_TO_FILTER example, but you want to ignore a Task model if it can’t match all of its counterparts:

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart1 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project1
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart 2", external_id=2, project=project1
)
counterpart3 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project2
)
employee = Employee.objects.create(name="Employee", company=company)
task1 = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project1,
)
task1.counterparts.add(counterpart1, counterpart2)
task2 = Task.objects.create(
    name="Task 2",
    description="Description",
    assignee=employee,
    project=project1,
)
task2.counterparts.add(counterpart1)

config = ModelCopyConfig(
    model=Employee,
    filter_field_to_input_key={"id": "employee_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "company": TAKE_FROM_ORIGIN,
        "tasks": MakeCopy(
            ModelCopyConfig(
                model=Task,
                ignore_condition=IgnoreCondition(
                    filter_conditions=[
                        IgnoreFilter(
                            filter_name="counterparts__id__in",
                            set_to_filter_field_name="counterparts",
                            set_to_filter_origin_model=Task,
                        )
                    ]
                ),
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "description": TAKE_FROM_ORIGIN,
                    "project_id": FieldCopyConfig(
                        action=CopyActions.TAKE_FROM_INPUT,
                        input_key="new_project_id",
                    ),
                    "counterparts": FieldCopyConfig(
                        action=CopyActions.SET_TO_FILTER,
                        reference_to=Counterpart,
                        filter_config=FilterConfig(
                            filters={
                                "project_id": FieldFilterConfig(
                                    source=FilterSource.FROM_INPUT,
                                    key="new_project_id",
                                ),
                                "external_id": FieldFilterConfig(
                                    source=FilterSource.FROM_ORIGIN
                                ),
                            }
                        ),
                    ),
                },
            )
        ),
    },
)

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert not result.is_copy_successful
assert result.reason == AbortReason.IGNORED

assert result.ignored_map[Task.__name__] == [task1.id]

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=True,
        set_to_filter_map=result.set_to_filter_map,
        ignored_map=result.ignored_map,
    )
).execute_copy_request()

assert result.is_copy_successful

new_tasks = Task.objects.filter(project=project2)
assert len(new_tasks) == 1
assert new_tasks[0].name == task2.name

In this example, we have two tasks, but one of them doesn’t have all of its counterparts matched in the new project. The default behaviour in this case is that only some of the counterparts are matched and the copy process continues.

But if you don’t want to copy the model at all in such a case, you can use IgnoreCondition with IgnoreFilter to ignore the model if it doesn’t match the condition.
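The effect of such an ignore condition can be pictured on plain data. This is a hedged sketch, not the library's implementation: `tasks_to_ignore` is a hypothetical helper, task ids map to sets of stringified counterpart ids, and `matches` has the "original id to matched id or None" shape used for the counterparts field in the examples.

```python
from typing import Dict, List, Optional, Set


def tasks_to_ignore(
    task_counterparts: Dict[int, Set[str]], matches: Dict[str, Optional[str]]
) -> List[int]:
    """Return ids of tasks that reference at least one unmatched counterpart."""
    unmatched = {
        original_id for original_id, matched in matches.items() if matched is None
    }
    return [
        task_id
        for task_id, counterpart_ids in task_counterparts.items()
        if counterpart_ids & unmatched
    ]


# Task 1 uses counterparts 1 and 2; task 2 uses only counterpart 1.
task_counterparts = {1: {"1", "2"}, 2: {"1"}}
matches = {"1": "3", "2": None}
ignored = tasks_to_ignore(task_counterparts, matches)
print(ignored)
```

Only the task that depends on the unmatched counterpart is dropped; the other task is still copied, which is the outcome the assertions in the example check.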

Ignoring based on nested data mismatches#

We can take the previous example further and ignore the whole Employee model if any counterpart of any of its Task models is not matched.

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart1 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project1
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart 2", external_id=2, project=project1
)
counterpart3 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project2
)
employee = Employee.objects.create(name="Employee", company=company)
task1 = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project1,
)
task1.counterparts.add(counterpart1, counterpart2)
task2 = Task.objects.create(
    name="Task 2",
    description="Description",
    assignee=employee,
    project=project1,
)
task2.counterparts.add(counterpart1)

config = ModelCopyConfig(
    model=Employee,
    filter_field_to_input_key={"id": "employee_id"},
    ignore_condition=IgnoreCondition(
        filter_conditions=[
            IgnoreFilter(
                filter_name="tasks__counterparts__id__in",
                set_to_filter_field_name="counterparts",
                set_to_filter_origin_model=Task,
            )
        ]
    ),
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "company": TAKE_FROM_ORIGIN,
        "tasks": MakeCopy(
            ModelCopyConfig(
                model=Task,
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "description": TAKE_FROM_ORIGIN,
                    "project_id": FieldCopyConfig(
                        action=CopyActions.TAKE_FROM_INPUT,
                        input_key="new_project_id",
                    ),
                    "counterparts": FieldCopyConfig(
                        action=CopyActions.SET_TO_FILTER,
                        reference_to=Counterpart,
                        filter_config=FilterConfig(
                            filters={
                                "project_id": FieldFilterConfig(
                                    source=FilterSource.FROM_INPUT,
                                    key="new_project_id",
                                ),
                                "external_id": FieldFilterConfig(
                                    source=FilterSource.FROM_ORIGIN
                                ),
                            }
                        ),
                    ),
                },
            )
        ),
    },
)

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert not result.is_copy_successful
assert result.reason == AbortReason.IGNORED

assert result.ignored_map[Employee.__name__] == [employee.id]

Notice how IgnoreCondition is used on the Employee model, where it defines the exclude filter and the information about where to search for mismatches.

Custom ignore function#

But sometimes ignoring just based on SET_TO_FILTER is not enough and you want to bring in custom logic.

In this case you can use ignore_func of IgnoreCondition to define custom ignore logic.

from typing import Any, Dict, List, Optional

from django.db.models import Model, Q

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart1 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project1
)
counterpart2 = Counterpart.objects.create(
    name="Counterpart 2", external_id=2, project=project1
)
counterpart3 = Counterpart.objects.create(
    name="Counterpart", external_id=1, project=project2
)
employee = Employee.objects.create(name="Employee", company=company)
task1 = Task.objects.create(
    name="Task",
    description="Description",
    assignee=employee,
    project=project1,
)
task1.counterparts.add(counterpart1, counterpart2)
task2 = Task.objects.create(
    name="Task 2",
    description="Description",
    assignee=employee,
    project=project1,
)
task2.counterparts.add(counterpart1)

def ignore_tasks(
    model_config: "ModelCopyConfig",
    set_to_filter_map: "SetToFilterMap",
    model_extra_filter: Optional[Q],
    ignored_map: "IgnoredMap",
    input_data: Dict[str, Any],
) -> List[Model]:
    not_matched_counterparts = {
        key
        for key, value in set_to_filter_map[Task.__name__]["counterparts"].items()
        if value is None
    }
    query = Task.objects.filter(counterparts__id__in=not_matched_counterparts)
    if model_extra_filter:
        query = query.filter(model_extra_filter)
    return list(query)

config = ModelCopyConfig(
    model=Employee,
    filter_field_to_input_key={"id": "employee_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "company": TAKE_FROM_ORIGIN,
        "tasks": MakeCopy(
            ModelCopyConfig(
                model=Task,
                ignore_condition=IgnoreCondition(
                    ignore_func=ignore_tasks,
                ),
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                    "description": TAKE_FROM_ORIGIN,
                    "project_id": FieldCopyConfig(
                        action=CopyActions.TAKE_FROM_INPUT,
                        input_key="new_project_id",
                    ),
                    "counterparts": FieldCopyConfig(
                        action=CopyActions.SET_TO_FILTER,
                        reference_to=Counterpart,
                        filter_config=FilterConfig(
                            filters={
                                "project_id": FieldFilterConfig(
                                    source=FilterSource.FROM_INPUT,
                                    key="new_project_id",
                                ),
                                "external_id": FieldFilterConfig(
                                    source=FilterSource.FROM_ORIGIN
                                ),
                            }
                        ),
                    ),
                },
            )
        ),
    },
)

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert not result.is_copy_successful
assert result.reason == AbortReason.IGNORED

assert result.ignored_map[Task.__name__] == [task1.id]

result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "employee_id": employee.id,
            "new_project_id": project2.id,
        },
        confirm_write=True,
        set_to_filter_map=result.set_to_filter_map,
        ignored_map=result.ignored_map,
    )
).execute_copy_request()

assert result.is_copy_successful

new_tasks = Task.objects.filter(project=project2)
assert len(new_tasks) == 1
assert new_tasks[0].name == task2.name

Here we defined the function ignore_tasks and used it to ignore Task models, based on the same logic as in the previous examples.

You can read more about the signature at IgnoreFunc.
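The lookup at the start of ignore_tasks can be tried out without a database. The following is a minimal sketch that assumes set_to_filter_map is shaped as model name, then field name, then a mapping from origin id to matched id (None when nothing matched), as the example above suggests. The helper name unmatched_keys and the sample data are hypothetical and not part of django-copyist.

```python
from typing import Dict, Optional, Set

# Assumed shape: model name -> field name -> {origin id: matched id or None}
FilterMap = Dict[str, Dict[str, Dict[str, Optional[int]]]]


def unmatched_keys(
    set_to_filter_map: FilterMap, model_name: str, field_name: str
) -> Set[str]:
    """Return the origin ids for which no match was found (value is None)."""
    field_map = set_to_filter_map.get(model_name, {}).get(field_name, {})
    return {key for key, value in field_map.items() if value is None}


# Hypothetical data: counterpart "1" was matched, "2" and "3" were not.
example_map: FilterMap = {
    "Task": {"counterparts": {"1": 10, "2": None, "3": None}}
}
not_matched = unmatched_keys(example_map, "Task", "counterparts")
```

Using .get with an empty default means a model or field that is absent from the map yields an empty set instead of raising KeyError.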

Controlling querysets with static filters#

Sometimes there is data that you want to leave out of the copy, based not on the input or on the data being matched, but on some static condition. For example, you may want to exclude all entities with a certain status.

Copyist allows you to do this using static_filters of ModelCopyConfig.

company = Company.objects.create(name="Company", address="Address")
employee = Employee.objects.create(name="Employee", company=company)
employee2 = Employee.objects.create(name="Employee 2 [FIRED]", company=company)

config = ModelCopyConfig(
    model=Company,
    filter_field_to_input_key={"id": "company_id"},
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "address": TAKE_FROM_ORIGIN,
        "employees": MakeCopy(
            ModelCopyConfig(
                model=Employee,
                static_filters=~Q(name__icontains="[FIRED]"),
                field_copy_actions={
                    "name": TAKE_FROM_ORIGIN,
                },
            )
        ),
    },
)
result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={"company_id": company.id},
        confirm_write=False,
    )
).execute_copy_request()

assert result.is_copy_successful
assert len(result.output_map["Company"]) == 1
assert len(result.output_map["Employee"]) == 1

new_company_id = result.output_map["Company"][str(company.id)]
new_employees = Employee.objects.filter(company_id=new_company_id)
assert len(new_employees) == 1
new_employee = new_employees[0]
assert new_employee.name == employee.name

static_filters is a Q object that will be used to filter the queryset of the model.
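To make the effect of the filter concrete, here is a plain-Python approximation of ~Q(name__icontains="[FIRED]"). It is only an analogue for illustration: the real filter is evaluated by the database, whose case-insensitive matching rules can differ from str.lower. The function name keeps_employee and the third sample name are hypothetical.

```python
from typing import List


def keeps_employee(name: str) -> bool:
    """Approximate ~Q(name__icontains="[FIRED]"): keep names without the marker."""
    return "[fired]" not in name.lower()


names: List[str] = ["Employee", "Employee 2 [FIRED]", "Employee 3 [fired]"]
kept = [name for name in names if keeps_employee(name)]
```

Only "Employee" passes, which matches the single copied employee asserted in the example above.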

Data preparation by deletion#

There are cases, usually when you copy from one existing context to another, where you need to clean up obsolete data in the destination context before moving fresh copies there.

In this case you can use data_preparation_steps of ModelCopyConfig.

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart11 = Counterpart.objects.create(
    name="11", external_id=1, project=project1
)
counterpart12 = Counterpart.objects.create(
    name="12", external_id=2, project=project1
)
counterpart21 = Counterpart.objects.create(
    name="21", external_id=1, project=project2
)
counterpart23 = Counterpart.objects.create(
    name="23", external_id=3, project=project2
)

config = ModelCopyConfig(
    model=Counterpart,
    filter_field_to_input_key={"project_id": "source_project_id"},
    data_preparation_steps=[
        DataPreparationStep(
            action=DataModificationActions.DELETE_BY_FILTER,
            filter_field_to_input_key={"project_id": "new_project_id"},
        )
    ],
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "external_id": TAKE_FROM_ORIGIN,
        "project_id": FieldCopyConfig(
            action=CopyActions.TAKE_FROM_INPUT,
            input_key="new_project_id",
        ),
    },
)
result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "source_project_id": project1.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert result.is_copy_successful

new_counterparts = list(
    Counterpart.objects.filter(project_id=project2.id).values_list(
        "name", flat=True
    )
)
assert set(new_counterparts) == {"11", "12"}

Here we used a preparation step to delete all Counterpart models linked to the destination project, so that fresh data can be copied there.

Data preparation with custom function#

Simply deleting data may not be the preparation you need.

In this case you can use func of DataPreparationStep, which gives you finer control over what happens.

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart11 = Counterpart.objects.create(
    name="11", external_id=1, project=project1
)
counterpart12 = Counterpart.objects.create(
    name="12", external_id=2, project=project1
)
counterpart21 = Counterpart.objects.create(
    name="21", external_id=1, project=project2
)
counterpart23 = Counterpart.objects.create(
    name="23", external_id=3, project=project2
)

def prepare_destination_project(
    model_config: "ModelCopyConfig",
    input_data: Dict[str, Any],
    set_to_filter_map: "SetToFilterMap",
    output_map: "OutputMap",
) -> None:
    original_external_ids = Counterpart.objects.filter(
        project_id=input_data["source_project_id"]
    ).values_list("external_id", flat=True)

    Counterpart.objects.filter(
        project_id=input_data["new_project_id"],
        external_id__in=original_external_ids,
    ).delete()

config = ModelCopyConfig(
    model=Counterpart,
    filter_field_to_input_key={"project_id": "source_project_id"},
    data_preparation_steps=[
        DataPreparationStep(
            action=DataModificationActions.EXECUTE_FUNC,
            func=prepare_destination_project,
        )
    ],
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "external_id": TAKE_FROM_ORIGIN,
        "project_id": FieldCopyConfig(
            action=CopyActions.TAKE_FROM_INPUT,
            input_key="new_project_id",
        ),
    },
)
result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "source_project_id": project1.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert result.is_copy_successful

new_counterparts = list(
    Counterpart.objects.filter(project_id=project2.id).values_list(
        "name", flat=True
    )
)
assert set(new_counterparts) == {"11", "12", "23"}

Here we did not delete all Counterpart models in the destination project, but only those whose external_id overlaps with the counterparts of the source project.

You can see the signature of the function at DataPreparationFunc.
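The difference between the two preparation strategies can be checked with an in-memory simulation. This sketch uses plain dictionaries in place of Counterpart rows and does not call django-copyist. The names prepare_and_copy and destination_names, and the only_overlapping flag, are hypothetical.

```python
from typing import Dict, List, Set, Union

Row = Dict[str, Union[str, int]]


def prepare_and_copy(
    rows: List[Row], source_id: int, destination_id: int, only_overlapping: bool
) -> List[Row]:
    """Delete destination rows, then copy the source rows into the destination."""
    source_rows = [r for r in rows if r["project_id"] == source_id]
    source_external_ids = {r["external_id"] for r in source_rows}

    def is_deleted(row: Row) -> bool:
        if row["project_id"] != destination_id:
            return False  # rows outside the destination are never touched
        if only_overlapping:
            return row["external_id"] in source_external_ids
        return True  # delete-by-filter removes every destination row

    kept = [r for r in rows if not is_deleted(r)]
    copies = [{**r, "project_id": destination_id} for r in source_rows]
    return kept + copies


def destination_names(rows: List[Row], destination_id: int) -> Set[str]:
    return {str(r["name"]) for r in rows if r["project_id"] == destination_id}


rows: List[Row] = [
    {"name": "11", "external_id": 1, "project_id": 1},
    {"name": "12", "external_id": 2, "project_id": 1},
    {"name": "21", "external_id": 1, "project_id": 2},
    {"name": "23", "external_id": 3, "project_id": 2},
]
delete_all = destination_names(prepare_and_copy(rows, 1, 2, False), 2)
delete_overlap = destination_names(prepare_and_copy(rows, 1, 2, True), 2)
```

With only_overlapping set to False the destination ends up with the two copies only. With it set to True, counterpart "23" survives because its external_id does not appear in the source project.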

Post-copy actions#

Similar to data preparation steps, you can define post-copy actions with postcopy_steps of ModelCopyConfig. They are executed after all data is copied.

company = Company.objects.create(name="Company", address="Address")
project1 = Project.objects.create(name="Project1", company=company)
project2 = Project.objects.create(name="Project2", company=company)
counterpart11 = Counterpart.objects.create(
    name="11", external_id=1, project=project1
)
counterpart12 = Counterpart.objects.create(
    name="12", external_id=2, project=project1
)

def delete_copied_data_in_source(
    model_config: "ModelCopyConfig",
    input_data: Dict[str, Any],
    set_to_filter_map: "SetToFilterMap",
    output_map: "OutputMap",
    copy_intent_list: "List[CopyIntent]",
) -> None:
    copied_id_list = [intent.origin.id for intent in copy_intent_list]
    Counterpart.objects.filter(id__in=copied_id_list).delete()

config = ModelCopyConfig(
    model=Counterpart,
    filter_field_to_input_key={"project_id": "source_project_id"},
    postcopy_steps=[
        PostcopyStep(
            action=DataModificationActions.EXECUTE_FUNC,
            func=delete_copied_data_in_source,
        )
    ],
    field_copy_actions={
        "name": TAKE_FROM_ORIGIN,
        "external_id": TAKE_FROM_ORIGIN,
        "project_id": FieldCopyConfig(
            action=CopyActions.TAKE_FROM_INPUT,
            input_key="new_project_id",
        ),
    },
)
result = Copyist(
    CopyRequest(
        config=CopyistConfig([config]),
        input_data={
            "source_project_id": project1.id,
            "new_project_id": project2.id,
        },
        confirm_write=False,
    )
).execute_copy_request()

assert result.is_copy_successful

new_counterparts = list(
    Counterpart.objects.filter(project_id=project2.id).values_list(
        "name", flat=True
    )
)
assert set(new_counterparts) == {"11", "12"}
assert Counterpart.objects.filter(project_id=project1.id).count() == 0

Here we used a post-copy step to delete data in the source project after it was copied to the destination project.

It could also be used to recalculate computed values based on the new context, to send notifications, or for whatever else you need.

You can read more about the signature at PostcopyFunc.

PostcopyStep also supports filter_field_to_input_key with DELETE_BY_FILTER.
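A post-copy function with the five-argument signature used above can be exercised without a database by passing stand-in objects. In this sketch FakeOrigin, FakeIntent and the in-memory table are hypothetical. Only the origin.id attribute used in the example above is modeled, so this says nothing about what else a real CopyIntent carries.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FakeOrigin:
    id: int


@dataclass
class FakeIntent:
    origin: FakeOrigin


# In-memory stand-in for a table: primary key -> name
table: Dict[int, str] = {1: "11", 2: "12", 3: "unrelated"}


def delete_copied_rows(
    model_config: Any,
    input_data: Dict[str, Any],
    set_to_filter_map: Dict[str, Any],
    output_map: Dict[str, Any],
    copy_intent_list: List[FakeIntent],
) -> None:
    """Remove every row that served as the origin of a copy."""
    for intent in copy_intent_list:
        table.pop(intent.origin.id, None)


intents = [FakeIntent(FakeOrigin(1)), FakeIntent(FakeOrigin(2))]
delete_copied_rows(None, {"source_project_id": 1}, {}, {}, intents)
```

After the call only the row that was never copied remains in the table.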

Note on ordering of operations#

Now that we are familiar with the different steps that Copyist can go through, let’s talk about the order of operations.

The order of operations is as follows:

  1. Validation of the whole configuration. It can raise exceptions if the same model is defined several times, or in case of other configuration errors

  2. Forming set_to_filter_map across all included models

  3. Resolving ignore_condition for all models

  4. If confirm_write is False

    1. If ignored_map is not empty - abort with IGNORED reason

    2. If set_to_filter_map is not empty - abort with NOT_MATCHED reason

  5. If data_preparation_steps are defined - execute them

  6. Go through field copy actions

    1. If there are nested model configs for fields resolving - they execute starting from step 5

  7. If compound_copy_actions are defined - execute them from step 5

  8. If postcopy_steps are defined - execute them
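The abort checks that run before anything is written can be sketched as a small pure function. This is an illustration of the control flow listed above, not the library's implementation. It assumes ignored_map maps a model name to a list of ignored ids, and that an entry of set_to_filter_map counts as unmatched when its value is None, as in the ignore_tasks example. The function name precheck and the plain string return values are hypothetical.

```python
from typing import Dict, List, Optional

IgnoredMap = Dict[str, List[int]]
FilterMap = Dict[str, Dict[str, Dict[str, Optional[int]]]]


def precheck(
    confirm_write: bool, ignored_map: IgnoredMap, set_to_filter_map: FilterMap
) -> Optional[str]:
    """Return an abort reason, or None when the copy may proceed."""
    if confirm_write:
        return None  # the caller has already reviewed ignored/unmatched data
    if any(ids for ids in ignored_map.values()):
        return "IGNORED"
    has_unmatched = any(
        value is None
        for fields in set_to_filter_map.values()
        for mapping in fields.values()
        for value in mapping.values()
    )
    if has_unmatched:
        return "NOT_MATCHED"
    return None


first_attempt = precheck(False, {"Task": [1]}, {})
confirmed_attempt = precheck(True, {"Task": [1]}, {})
```

This mirrors the two-call pattern used in the ignore examples: the first request aborts and reports what would be skipped, and the second request passes the maps back with confirm_write=True.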

Note on performance#

Even though django-copyist was built with performance in mind, it is still a tool that has to do a lot of work, especially when there are tens of thousands of records to copy.

So, if you are copying a lot of data, you should consider running these copies as background tasks, for example with a task queue such as Celery.

The formats of CopyRequest and CopyResult are designed to be easily serializable, so that you can record the progress of a copy in your database.
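As a rough illustration of handing such data to a background task, the sketch below round-trips plain dictionaries through JSON. It deliberately uses only the standard library and does not rely on any serialization helpers that django-copyist itself may provide. The payload layout and the names to_task_payload and from_task_payload are hypothetical.

```python
import json
from typing import Any, Dict


def to_task_payload(
    input_data: Dict[str, Any],
    ignored_map: Dict[str, Any],
    set_to_filter_map: Dict[str, Any],
) -> str:
    """Pack the pieces of a copy request into a JSON string for a task queue."""
    return json.dumps(
        {
            "input_data": input_data,
            "ignored_map": ignored_map,
            "set_to_filter_map": set_to_filter_map,
        },
        sort_keys=True,
    )


def from_task_payload(payload: str) -> Dict[str, Any]:
    """Unpack the JSON string inside the background task."""
    return json.loads(payload)


payload = to_task_payload(
    {"employee_id": 1, "new_project_id": 2},
    {"Task": [7]},
    {"Task": {"counterparts": {"3": None}}},
)
restored = from_task_payload(payload)

# JSON object keys are always strings, so integer dict keys change type.
int_keys_roundtrip = json.loads(json.dumps({1: "a"}))
```

One thing to keep in mind with any JSON-based storage is the last line: a dictionary keyed by integer ids comes back keyed by strings, so lookups after a round trip need str(id).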