Importing specific fields with overwrite_properties

While I had planned to stretch out my posts related to the "Acme" project, there are currently some people with questions about using overwrite_properties - so, I've moved this post forward.

By default, migration treats the source data as the system of record - that is, when reimporting previously-imported records, the expectation is to completely replace the destination side with fresh source data, discarding any interim changes which might have been made on the destination side. Sometimes, however, you may want an update to pull only specific fields from the source, leaving others (potentially manually edited) intact. We had this situation with the event feed - in particular, the titles received from the feed may need to be edited for the public site. To achieve that, we used the overwrite_properties property on the destination plugin:

destination:
  plugin: 'entity:node'
  overwrite_properties:
    - 'field_address/address_line1'
    - 'field_address/address_line2'
    - 'field_address/locality'
    - 'field_address/administrative_area'
    - 'field_address/postal_code'
    - field_start_date
    - field_end_date
    - field_instructor
    - field_location_name
    - field_registration_price
    - field_remaining_spots
    - field_synchronized_title

When overwrite_properties is present, nothing changes when importing a new entity - but, if the destination entity already exists, the existing entity is loaded, and only the fields and properties enumerated in overwrite_properties will be, well, overwritten. In our example, note in particular field_synchronized_title - on initial import, both the regular node title and this field are populated from ClassName, but on updates only field_synchronized_title receives any changes in ClassName. This prevents any unexpected changes to the public title, but does make the canonical title from the feed available should an editor care to review and decide whether to modify the public title to reflect any changes.
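To make the title handling concrete, the relevant process mappings might look roughly like the sketch below. This is an assumption about the shape of the configuration rather than the project's actual file: it assumes ClassName is exposed as a plain source property and that both destinations are mapped directly, with no intermediate process plugins.

```yaml
process:
  # Public title: set on the initial import only, because 'title' is
  # not listed in overwrite_properties.
  title: ClassName
  # Canonical feed title: listed in overwrite_properties, so it is
  # refreshed from the feed on every update.
  field_synchronized_title: ClassName
```

Both fields get the same value on the first import. After that, only field_synchronized_title follows the feed.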

Now, in this case we are creating the entities initially through this migration, and thus we know via the map table when a previously-migrated entity is being updated and thus overwrite_properties should be applied. Another use case is when the entire purpose of your migration is to update specific fields on pre-existing entities (i.e., not created by this migration). In this case, you need to map the IDs of the entities that are to be updated, otherwise the migration will simply create new entities. So, if you had a "nid_to_update" property in your source data, you would include

process:
  nid: nid_to_update

in your migration configuration. The destination plugin will then load that existing node and alter only the properties specified in overwrite_properties.
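Putting the two pieces together, an update-only migration could be sketched along these lines. The source property name "price" is hypothetical and used purely for illustration; field_registration_price is borrowed from the earlier example, and the fields you list should be whichever ones you actually want refreshed.

```yaml
process:
  # ID of the pre-existing node to update. Without this mapping the
  # migration would create new nodes instead.
  nid: nid_to_update
  # 'price' is a hypothetical source property name.
  field_registration_price: price
destination:
  plugin: 'entity:node'
  # Only the properties listed here are changed on the loaded node.
  overwrite_properties:
    - field_registration_price
```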
