Universal Import Service 3.0.1-SNAPSHOT

The Universal Import Service is used to import data into arveo from third party systems. Currently, the service supports reading data from CSV files and from the filesystem.

The service is based on Apache Camel and monitors configurable directories for files to import. It provides plugin interfaces for CsvLineMappers and FileMappers. CsvLineMappers are used to create one or more arveo entities from one line in a CSV file. FileMappers do the same for arbitrary files. A highly configurable default mapper implementation for both use cases is provided, which should be sufficient for most import scenarios.

Configuration

The service offers some generic settings that apply to all mapping configurations. Because the service is based on Apache Camel, several configuration options for the Camel components apply.

The Camel routes used to import files use a URI-parameter to configure the start of the route. This makes it possible to select and configure the Camel component for the start of the route by URI scheme.

A URI must be configured for each directory to import files from. The keys in the configuration maps define the IDs of the Camel routes. In the following example, three CSV import routes and one file import route are configured.

configuring the URIs for CSV imports

universal-import-service:
  csv:
    csv-import-1:
      uri: "file://${project.build.testOutputDirectory}/csv?antInclude=**/test-Demo.csv&noop=false"
    csv-import-2:
      uri: "file://${project.build.testOutputDirectory}/csv?antInclude=**/test-100.csv&noop=false"
    csv-import-3:
      uri: "file://${project.build.testOutputDirectory}/csv?antInclude=**/test-100_2.csv&noop=false"

configuring the URI for file imports

universal-import-service:
  file:
    file-import-1:
      uri: "file://${project.build.testOutputDirectory}/content?noop=true&moveFailed=.error"

In the example, the file component of Apache Camel is activated by the file: scheme of the URI. The Camel documentation contains information about the available parameters for the file endpoint as well as the other available endpoints.

The way CSV files are processed is controlled by the CSV data format of Apache Camel. It offers various configuration properties that are listed in the Camel documentation. The properties can be used in the configuration file for the Universal Import Service as shown below:

configuring the CSV data format

camel:
  dataformat:
    csv:
      delimiter: ";"

In the example, the delimiter for the CSV columns is set to ;.

The configuration settings for the CSV data format apply to all CSV import routes.

Generic CSV mapper

The generic CSV mapper is the default implementation of the CsvLineMapper interface contained in the service. The mapper maps each column in the CSV file to an attribute of an arveo entity. Content to import is read from a configurable column, which can contain zero or more file names to import. The file names can either contain a fully qualified path or just the name of the file. In the latter case, the directory containing the files with the actual content can be configured (see below).

The mapper offers three different modes:

SIMPLE: This is the default. Each line in the CSV file is mapped to one arveo document, which might contain zero or more content elements. The mapping of files to content elements is fixed and can either map file names to content element names or positions in the list of files to content element names.
COUNTING: Each line is mapped to one or more arveo documents. Each document contains either one content element or no content at all. A counter can be used as prefix or suffix for any imported attribute to distinguish the documents created for one line.
REFERENCING: Each line is mapped as a simple record structure consisting of a container entity containing all attributes and zero or more document components referenced by a foreign key, each containing one content element.

The generic CSV mapper supports individual configurations for each configured import route as shown in the following example:

configuring the generic mapper

universal-import-service:
  csv:
    csv-import-1: (1)
      uri: "file://..."
  generic-csv-mapper:
    settings:
      csv-import-1: (2)
        type-definition-name: "demo_document"

1	The ID of the import route for the configured directory
2	The ID of the configuration for the generic CSV mapper. Must match the ID of the import route.

Attribute mapping

The mapping of CSV columns to arveo attributes works the same in each mode. An attribute mapping must be configured for each CSV column that is supposed to be imported. Attribute mappings are configured in a map, the keys being the names of the columns of the CSV file. An attribute mapping consists of the following parameters:

Table 1. attribute mapping parameters
Parameter	Explanation
attribute-name	The name of the arveo attribute (in snake-case)
type	The type of the attribute (`SHORT`, `INTEGER`, `LONG`, `DOUBLE`, `STRING`, `BOOLEAN`, `DATE_TIME`, `DATE` or `TIME`)
array	Whether the attribute is multivalued or not (the default is false).
delimiter	The delimiter of multivalued attributes. Ignored when `array` is set to false.
date-pattern	The pattern used to parse attributes of type `DATE`.
time-pattern	The pattern used to parse attributes of type `TIME`.
date-time-pattern	The pattern used to parse attributes of type `DATE_TIME`.
zone-id	The time zone ID used when the value in the CSV column for `DATE_TIME` attributes does not contain time zone information.
local-date	The local date for a date-time attribute used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
local-time	The local time for a date-time attribute used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
prefix	An optional prefix added to imported attributes of type `STRING`.
suffix	An optional suffix added to imported attributes of type `STRING`.
default-value	The default value used when the line does not contain a value for a configured attribute mapping. Must follow the same format rules as the other values in the column.
number-format.pattern	Format pattern for numeric values.
number-format.grouping-separator	The grouping separator character.
number-format.decimal-separator	The decimal separator character.
unique-identifier	Marks the attribute as a unique identifier or as part of a combined unique identifier that will be used to check if the entity already exists.

Attributes are parsed using the configured number format settings for numeric values, or using the supplied patterns for date, time or date-time values. Booleans can either be strings ('true', 'false') or integers (0, 1).

The settings for numeric values allow configuring grouping and decimal separators as well as a (non-localized) pattern for numeric values. The default pattern #,##0.### requires one or more digits in front of the decimal separator and allows up to three digits behind it. The default grouping separator is ,. The default decimal separator is .. Note that the separators in the pattern are not localized, i.e. they have to be the default separators for the English locale.
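
The interplay between the non-localized pattern and the configurable separators can be illustrated with the JDK's java.text.DecimalFormat. This is only a sketch: the text above does not state which parser the service uses internally, and the class and method names (NumberFormatSketch, build, parse) are made up for this example.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

final class NumberFormatSketch {

    // The pattern always uses ',' and '.'; the symbols define the characters
    // that actually appear in the data.
    static DecimalFormat build(String pattern, char groupingSeparator, char decimalSeparator) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ENGLISH);
        symbols.setGroupingSeparator(groupingSeparator);
        symbols.setDecimalSeparator(decimalSeparator);
        DecimalFormat format = new DecimalFormat(pattern, symbols);
        format.setParseBigDecimal(true);
        return format;
    }

    // Parses the whole value and rejects input with trailing garbage.
    static BigDecimal parse(DecimalFormat format, String value) {
        ParsePosition position = new ParsePosition(0);
        Number number = format.parse(value, position);
        if (!(number instanceof BigDecimal) || position.getIndex() != value.length()) {
            throw new IllegalArgumentException("Not a valid number: " + value);
        }
        return (BigDecimal) number;
    }

    public static void main(String[] args) {
        // Default separators: ',' for grouping and '.' for decimals.
        DecimalFormat defaults = build("#,##0.###", ',', '.');
        // Same pattern, but the data uses '.' for grouping and ',' for decimals.
        DecimalFormat swapped = build("#,##0.###", '.', ',');
        System.out.println(parse(defaults, "1,234.5"));
        System.out.println(parse(swapped, "1.234,5"));
    }
}
```

Both calls in main yield the same numeric value, because only the separator symbols differ while the pattern stays in its non-localized form.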

Empty values (null, an empty string or a string containing only whitespace) are mapped to null. A default value to return instead can be configured in the attribute mapping. The default value is a string in the same format as expected in the original value.
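
The rule for empty values can be written down in a few lines. The following is a minimal sketch with hypothetical names (EmptyValueSketch, resolve). It assumes that non-empty values are passed on unchanged, which the text above does not specify.

```java
final class EmptyValueSketch {

    // Returns the configured default (which may itself be null) for empty input,
    // otherwise the raw value as read from the CSV column.
    static String resolve(String rawValue, String defaultValue) {
        boolean empty = rawValue == null || rawValue.isBlank();
        return empty ? defaultValue : rawValue;
    }

    public static void main(String[] args) {
        System.out.println(resolve("   ", null));           // whitespace only, no default
        System.out.println(resolve("", "records"));         // empty string, default configured
        System.out.println(resolve("invoices", "records")); // regular value
    }
}
```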

The default date-time-pattern used by the importer for date-time attributes ([u-M-d]['T'][H:m:s][X]) allows parsing partial values for date-time attributes. The parts in square brackets (the date, the letter 'T', the time and the zone) are optional. If no value for these parts is contained in the parsed value, it is replaced by the configured default local-date, local-time or zone-id.
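
The pattern syntax matches that of java.time.format.DateTimeFormatter, so the effect of the optional sections can be tried out with plain JDK classes. The sketch below is an illustration under that assumption, not the service's actual implementation. The class PartialDateTimeSketch and its parse method are hypothetical, and the three fallback arguments stand in for the configured local-date, local-time and zone-id.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

final class PartialDateTimeSketch {

    private static final DateTimeFormatter DEFAULT_PATTERN =
            DateTimeFormatter.ofPattern("[u-M-d]['T'][H:m:s][X]");

    // Parses a possibly partial value and fills the missing parts from the defaults.
    static ZonedDateTime parse(String value, LocalDate localDate, LocalTime localTime, ZoneId zoneId) {
        TemporalAccessor parsed = DEFAULT_PATTERN.parse(value);
        // Each query returns null when the corresponding optional section was absent.
        LocalDate date = parsed.query(TemporalQueries.localDate());
        LocalTime time = parsed.query(TemporalQueries.localTime());
        ZoneOffset offset = parsed.query(TemporalQueries.offset());
        return ZonedDateTime.of(
                date != null ? date : localDate,
                time != null ? time : localTime,
                offset != null ? offset : zoneId);
    }

    public static void main(String[] args) {
        LocalDate localDate = LocalDate.parse("2011-12-03");
        LocalTime localTime = LocalTime.parse("10:15");
        ZoneId zoneId = ZoneId.of("UTC");
        // Full value, date only, time only.
        for (String value : new String[] {"2024-02-29T08:30:00Z", "2024-02-29", "08:30:00"}) {
            System.out.println(value + " -> " + parse(value, localDate, localTime, zoneId));
        }
    }
}
```

In this sketch a date-only value takes its time from the default local time, and a time-only value takes its date from the default local date.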

The partial default values local-date, local-time and zone-id will not be used for empty values. Use the default-value instead to configure a default for empty values.

Attribute mappings are configured for each mapper mode. For example, when the counting mode is used, the attribute mappings would be configured in the setting universal-import-service.generic-csv-mapper.counting.attributes.

The example below shows a mapping configuration for the import of CSV columns called 'sysrowid', 'systimestamp' and 'ispdf'.

attribute mapping

attributes:
  sysrowid:
    attribute-name: "sys_row_id"
    type: STRING
    unique-identifier: true
  systimestamp:
    attribute-name: "sys_time_stamp"
    type: DATE_TIME
    date-time-pattern: "u-M-d H:m:s"
    zone-id: "UTC"
  ispdf:
    attribute-name: "pdf"
    type: BOOLEAN
  archive:
    attribute-name: "archive"
    type: STRING
    default-value: "records"

Attributes using a default value do not have to be contained in the CSV file. This makes it possible to add new attributes that were not contained in the original data.

null values

When an entity is updated, it might be necessary to actively set an existing attribute value in the archive to null. This can be achieved by setting the value of the attribute in the imported data to the configured attribute-null-value. The default attribute null value is $null. When this value is used as a default-value in an attribute mapping, attributes not present in the imported data will be set to null.

Unique identifier checks

When an attribute mapping uses the unique-identifier: true setting, it will be used to check if the entity already exists in the archive before the line is processed. When multiple attribute mappings are marked as unique identifiers, all attribute values are compared to check for existence. There are some limitations to keep in mind when this feature is used:

Each check requires an additional request that performs a query. For performance reasons, the existence checks should be disabled for the initial import.
The referencing mode will only check for existing entities in the container type definition.
The counting mode still needs to preprocess the uploaded content to determine the number of entities to create before the existence checks can be done.

Simple mode

The simple mode is the default operating mode of the generic CSV mapper. In this mode, a 1:1 mapping between file names read from the CSV file and content element names must be configured. The mapping can either be from file name to content element name or from the position of the file name in the list to a content element name. Because the simple mode is the default, it does not have to be explicitly enabled in the configuration.

simple mode configuration

generic-csv-mapper:
  settings:
    csv-import-1: (1)
      type-definition-name: "demo_document" (2)
      conflict-resolution-strategy: UPDATE_ATTRIBUTES
      skip-empty-documents: true
      simple:
        content:
          csv-field-name: "filename" (3)
          content-path: "${project.build.testOutputDirectory}/content" (4)
          position-mappings:
            0: "content" (5)

1	The name of the import configuration. Must match the name of the configured route (see Configuration)
2	The name of the type definition that will contain the imported documents
3	The name of the field in the CSV file containing the file names
4	The path of the directory that contains the files. In this case, the CSV is expected to contain only the file names.
5	Mapping by position. The first file will be stored in the content element named "content".

A complete configuration example can be found in the system-test module in the file src/test/resource-templates/config/universal-import-service.yaml.

Counting mode

In the counting mode, CSV lines containing more than one file name are mapped to multiple independent document entities. Each document entity will contain one content element. If a line in the CSV file does not contain any file names, one document with no content elements will be created. A counter can be added to the suffix or prefix of any string attribute by using the placeholder ${contentElementNumber}.

counting mode configuration

generic-csv-mapper:
  settings:
    csv-import-2:
      type-definition-name: "document" (1)
      skip-empty-documents: true
      mode: COUNTING (2)
      counting:
        content:
          csv-field-name: "filename" (3)
          delimiter: "," (4)
          content-path: "${project.build.testOutputDirectory}/content" (5)
        attributes:
          xhdoc:
            attribute-name: "xhdoc"
            type: STRING
            suffix: "_${contentElementNumber}" (6)
            unique-identifier: true

1	The name of the type definition that will contain the imported documents
2	The counting mode must be enabled explicitly
3	The name of the field in the CSV file containing the file names
4	The delimiter used to separate file names
5	The path of the directory that contains the files. In this case, the CSV is expected to contain only the file names.
6	Adds a suffix with the counter (starting at 1) of the file

A complete configuration example can be found in the system-test module in the file src/test/resource-templates/config/universal-import-service-counting.yaml.
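
To illustrate the effect of the counter placeholder, the following sketch expands one attribute value into the values of the documents created for a line. It is a simplified model with hypothetical names (CountingSuffixSketch, expand). The counter value used for a line without file names is not stated above, so the sketch does not cover that case.

```java
import java.util.ArrayList;
import java.util.List;

final class CountingSuffixSketch {

    static final String PLACEHOLDER = "${contentElementNumber}";

    // Returns the attribute value of each document created for one CSV line.
    static List<String> expand(String value, String suffixTemplate, int fileCount) {
        if (fileCount < 1) {
            // Lines without file names are intentionally not modeled here.
            throw new IllegalArgumentException("fileCount must be at least 1");
        }
        List<String> values = new ArrayList<>();
        for (int counter = 1; counter <= fileCount; counter++) {
            // The counter starts at 1 for the first file of the line.
            values.add(value + suffixTemplate.replace(PLACEHOLDER, Integer.toString(counter)));
        }
        return values;
    }

    public static void main(String[] args) {
        // A line with the value "A17" in column xhdoc and three file names.
        System.out.println(expand("A17", "_${contentElementNumber}", 3)); // [A17_1, A17_2, A17_3]
    }
}
```

Because each resulting value differs, the attribute can still serve as a unique identifier for the individual documents, as in the configuration shown above.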

Referencing mode

In the referencing mode, a record container is created for each imported document. This record will contain all attributes, but no content. For each imported file, a document is created that contains only the data of the imported file. The documents are referenced by a foreign key containing the ID of the record container.

The imported documents do not contain any custom attributes, but arveo’s inheritance feature can be used to automatically inherit attributes from the referenced record.

referencing mode configuration

generic-csv-mapper:
  settings:
    csv-import-3:
      type-definition-name: "component" (1)
      conflict-resolution-strategy: UPDATE_ATTRIBUTES
      skip-empty-documents: true
      mode: REFERENCING (2)
      referencing:
        container-type-definition-name: "container" (3)
        reference-field-name: "container_id" (4)
        content:
          csv-field-name: "filename" (5)
          delimiter: "," (6)
          content-path: "${project.build.testOutputDirectory}/content" (7)

1	The name of the type definition that will contain the imported documents
2	The referencing mode must be enabled explicitly
3	The name of the type definition containing the record containers
4	The name of the attribute in the documents containing the foreign key
5	The name of the field in the CSV file containing the file names
6	The delimiter used to separate file names
7	The path of the directory that contains the files. In this case, the CSV is expected to contain only the file names.

A complete configuration example can be found in the system-test module in the file src/test/resource-templates/config/universal-import-service-referencing.yaml.

Configuring the foreign key

The foreign key links the document entities to a container entity. As shown above, the reference field name in the document entities must be configured. The field in the container entity referenced by the foreign key is, by default, the system field id. The value will be retrieved automatically by the batch processing API of the content repository service. It is possible to change the name of the referenced field using the property reference-target-field-name. It is also possible to disable the automatic retrieval of the foreign key value by configuring no value for the reference-field-name property. The value must be provided by the attribute mappings in this case.

Document attributes

By default, the document does not have any attributes except the required system fields and the reference field. In some cases, the document type definition might require additional custom fields. Those can be configured by providing additional attribute mappings for the document entities as shown below:

configuring document attribute mappings

generic-csv-mapper:
  settings:
    csv-import-3:
      mode: REFERENCING
      referencing:
        document-attributes:
          docid:
            attribute-name: "doc_id"
            type: STRING
          contentRep:
            attribute-name: "repository_id"
            type: STRING
          componentName:
            attribute-name: "component_id"
            type: STRING
            default-value: "data_${documentNumber}"

The generic CSV mapper provides one placeholder ${documentNumber} that contains the document’s number (starting at 1) and that can be used either in a prefix or suffix or in a default value.

Content preprocessing

The generic CSV mapper provides an additional extension mechanism that can be used to preprocess the binary content for each individual line in the CSV file before it is handed over to the mapper. This way it is possible to perform tasks like decryption, conversion or merging of content.

The extension mechanism works by registering a bean of type de.eitco.uis.mappers.common.ContentPreprocessor (for example, by creating a custom Spring Boot starter). Each ContentPreprocessor must implement the preprocess method, which has the following parameters:

fileNames: A list of file names that was parsed from the current line in the CSV file (never null)
csvFilePath: The directory that contains the CSV file (never null)
contentPath: The configured path used to find the actual files (might be null)

The ContentPreprocessor returns a list of PreprocessedContent instances, which consist of an InputStream, the file name that will be used in arveo and the position of the file in the original list of file names. How many PreprocessedContent instances are contained in the returned list depends on the preprocessor. For example, an implementation might take all files and merge them into a single PDF file. Other implementations might just wrap the returned streams for decryption. Custom implementations can extend the class de.eitco.uis.mappers.common.AbstractContentPreprocessor, which contains a utility method to open new FileInputStreams.

It is possible to register several ContentPreprocessor beans. The processor used for a specific import route is selected by calling the processor’s usedForRoute method with the ID of the route. The first processor found that returns true in this method will be used.
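
The selection rule can be pictured as a "first match wins" lookup over the registered beans. The sketch below uses a hypothetical stand-in interface (RouteAware) that only contains the usedForRoute method mentioned above; the real ContentPreprocessor interface has further methods that are not reproduced here. Returning an empty Optional when no processor matches is an assumption of this sketch.

```java
import java.util.List;
import java.util.Optional;

final class PreprocessorSelectionSketch {

    // Hypothetical stand-in for the route check of a preprocessor bean.
    interface RouteAware {
        boolean usedForRoute(String routeId);
    }

    // Returns the first processor (in registration order) that accepts the route.
    static <T extends RouteAware> Optional<T> select(List<T> processors, String routeId) {
        return processors.stream()
                .filter(processor -> processor.usedForRoute(routeId))
                .findFirst();
    }

    public static void main(String[] args) {
        RouteAware decrypting = routeId -> routeId.equals("csv-import-1");
        RouteAware fallback = routeId -> true;
        List<RouteAware> processors = List.of(decrypting, fallback);

        System.out.println(select(processors, "csv-import-1").get() == decrypting); // true
        System.out.println(select(processors, "csv-import-2").get() == fallback);   // true
    }
}
```

Since the first match is used, a catch-all processor like the fallback in this example only takes effect if it comes after the more specific ones.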

The interfaces to implement as well as the other classes required to implement a ContentPreprocessor are contained in the following artifact:

<dependency>
    <groupId>de.eitco.uis</groupId>
    <artifactId>universal-import-mappers-common</artifactId>
    <version>${import.service.version}</version>
</dependency>

To enable custom extensions, the Jar containing the Spring Boot Starter (and any additional Jar, if required) must be placed in the directory configured as loader path of the service using the -Dloader.path parameter.

Attribute preprocessing

Similar to the content preprocessing plugin mechanism, there is another extension mechanism that can be used to perform custom preprocessing actions on the mapped attributes.

The extension mechanism works by registering a bean of type de.eitco.uis.mappers.common.AttributesPreprocessor (for example, by creating a custom Spring Boot starter). Each AttributesPreprocessor must implement the preprocessAttributes method, which has the following parameters:

attributes: The attributes that were resolved by the configured mappings
line: The original line in the CSV being processed
contentUploadMap: A map containing the resolved content uploads (after being preprocessed)
csvFilePath: The path of the current CSV file
typeDefinitionName: The name of the type definition the attributes will be used for

The AttributesPreprocessor must return a map containing all attributes that are supposed to be added to the created entity. It can remove, add or edit any attribute already present in the map.

It is possible to register several AttributesPreprocessor beans. The processor used for a specific import route is selected by calling the processor’s usedForRoute method with the ID of the route. The first processor found that returns true in this method will be used.

The interfaces to implement as well as the other classes required to implement an AttributesPreprocessor are contained in the following artifact:

<dependency>
    <groupId>de.eitco.uis</groupId>
    <artifactId>universal-import-mappers-common</artifactId>
    <version>${import.service.version}</version>
</dependency>

Conflict resolution strategies

The Universal Import Service features several different strategies to resolve conflicts with already existing entities.

The strategy to use can be selected by configuring the following parameter. The default strategy is SKIP.

conflict resolution strategy for the simple mapper mode

universal-import-service:
  generic-csv-mapper:
    settings:
      csv-import-1:
        simple:
          conflict-resolution-strategy: UPDATE_ATTRIBUTES

conflict resolution strategy for the counting mapper mode

universal-import-service:
  generic-csv-mapper:
    settings:
      csv-import-2:
        mode: COUNTING
        counting:
          conflict-resolution-strategy: UPDATE_ATTRIBUTES

conflict resolution strategy for the referencing mapper mode

universal-import-service:
  generic-csv-mapper:
    settings:
      csv-import-3:
        mode: REFERENCING
        referencing:
          conflict-resolution-strategy: UPDATE_ATTRIBUTES

The following strategies are available:

SKIP: The default strategy. Existing entities will be skipped.
UPDATE: Existing entities will be updated.
UPDATE_ATTRIBUTES: Attributes of existing entities will be updated. The content of documents will be ignored.
OVERWRITE: Existing entities will be overwritten.
OVERWRITE_ATTRIBUTES: Attributes of existing entities will be overwritten. The content of documents will be ignored.

Overwriting existing entities must be enabled in the type definition in use.

Except for the default SKIP strategy, all other strategies require at least one unique-identifier attribute. The unique identifier attributes will be used to create a selector passed on to the Content Repository Service. The Content Repository Service will use this selector to perform an update on the existing entity.

Importing from CSF files

The Universal Import Service can be used to import data from Saperion CSF files. CSF files are basically CSV files with a proprietary header containing configuration for the Saperion importers. The Universal Import Service can use some of the settings read from the header. The reading of the actual data from the CSF file is done using the same mechanism as is used for generic CSV files. So the same configuration options, especially for attribute mappings, can be used for CSF files, too.

Settings from CSF headers

The following information is read from the CSF headers:

separator: The separator used in the body of the CSF file.
date pattern: The format of date values in the body of the CSF file
time pattern: The format of time values in the body of the CSF file
date-time pattern: The format of timestamps in the body of the CSF file
legacy import mode: This controls the behavior of the old import mechanism in the Saperion Rich Client or Core Server.
duplicate handling: This controls how the Saperion Universal Importer handles duplicate documents.
document handling: This controls how the Saperion Universal Importer treats actual binary content of duplicate documents.
property handling: This controls how the Saperion Universal Importer treats attributes of duplicate documents.
DDC name: The name of the Saperion DDC to import data into.
key field names: Names of fields that can be used to identify duplicate documents.

Except for the separator, all of these fields in the header are optional.

The Universal Import Service uses the separator and the patterns for date, time and date-time values (if present) to read the actual data from the CSF file. The key field names are used to identify potential conflicts with already existing entities. The DDC name can be used in the configuration (see below) to determine the name of the type definition to import into.

The Universal Import Service maps the settings for the legacy import mode and the duplicate, document and property handling to the conflict resolution strategies of the generic CSV mapper. Due to the differences in the data model between Saperion and arveo, not all of these settings are supported.

Legacy import mode

The legacy import mode setting can be mapped to the conflict resolution strategies. Both modes are supported:

U (update-only): Only entities that are already present in the archive will be updated. No new entities will be created.
A (create-or-update): Existing entities will be updated, new entities will be created.

This will work only when the key field names are not empty.

Duplicate handling

Because arveo enforces uniqueness of entities, only the REVISE and REVISEONLY modes are supported.

REVISE (create-or-update): Existing entities will be updated, new entities will be created.
REVISEONLY (update-only): Only entities that are already present in the archive will be updated. No new entities will be created.

Document handling

Unlike Saperion, arveo does not support an arbitrary number of content elements for one document entity. Some of the behavior of the Saperion Universal Importer can be reproduced using the different importing modes of the generic CSV mapper:

SIMPLE: In the simple mode, the number of possible content elements is fixed. The importer will update existing content elements, but new ones cannot be appended.
COUNTING: In this mode, the importer will create separate documents for each content element. This makes it possible to update existing ones and to create new entities for additional content elements.
REFERENCING: In this mode, content elements are stored as separate document entities which are linked to a common container entity using foreign keys. This mode will append new content elements and replace existing ones when they can be identified by a unique key.

The REPLACEALL mode of the Saperion Universal Importer is not supported.

Property handling

The setting for the property handling mode is ignored. By default, the Universal Import Service will behave as if the UPDATE setting is active. However, it is possible to reproduce the OVERWRITE behavior by using the attribute-null-value as a default value for attributes not present in the body of the CSF file.

Configuring CSF imports

CSF imports are configured by defining the file system resource to monitor for CSF files as shown below:

universal-import-service:
  csf:
    csf-import-1:
      uri: "file://${project.build.testOutputDirectory}/csf?antInclude=**/csftest1.csf&noop=false"
    csf-import-2:
      uri: "file://${project.build.testOutputDirectory}/csf?antInclude=**/csftest4.csf&noop=false"
    csf-import-3:
      uri: "file://${project.build.testOutputDirectory}/csf?antInclude=**/csftest3.csf&noop=false"
  type-name-mappings: (1)
    AMV_TBL: incident

1	This setting can be used to configure a mapping from DDC names read from the CSF header to type definition names

Writing a custom line mapper

Custom line mappers have to implement the interface CsvLineMapper. Mappers can use the typed or the generic API of arveo. A mapper that uses the generic API has to return true in the isGeneric method implementation and has to implement the mapLineGeneric method. Typed mappers have to return false in the isGeneric method and have to implement the mapLine method.

The custom mapper implementation has to be registered as a Spring bean in a custom Spring Boot starter. To replace the provided default mapper, the custom auto starter either has to run before the auto configuration class de.eitco.uis.mappers.csv.GenericCsvMapperAutoConfiguration or the default mapper bean registrations have to be disabled by setting the property universal-import-service.generic-csv-mapper.enabled to false.

Line mappers return a list of batch operations. The arveo entities created for one line can be created using the respective batch operation(s). The operations will be executed in the order in which they are contained in the list.

The custom mapper can be activated by adding the jar of the custom starter to the service’s libs directory configured by the parameter -Dloader.path.

Generic file mapper

The generic file mapper is the default implementation of the FileMapper interface contained in the service. It can import files from a configurable directory into arveo. Properties of the files, for example parts of the path or file name, can be extracted as attributes of the arveo entities.

The mapper operates in two phases:

Phase 1: Collect properties of the file to import. These can be parts of the path, the length or type of the file.
Phase 2: Map collected properties to arveo attributes.

Property collection

Which properties to collect is configurable. Properties can be extracted from the path and file name by position or by using a regular expression. The path is split using the system’s path separator. The file name can be split using a configurable separator. The value of each property is a string that can be mapped to an attribute using the attribute mappings described below.

Positional properties

Positional properties are collected from the path or file name, after it is split into an array of strings either by using the path separator or the configured file name separator character. The position can be given as a positive or negative integer. A positive integer (starting at 1) defines the position from left to right. A negative integer (starting at -1) defines the position from right to left. For example, in the path /path/to/my/file.txt, the position 1 would match path and the position -2 would match my. Likewise, when the file name is used to collect a property and the file name separator is configured as _, the position 2 in the file name invoice_20250982.pdf would match 20250982 (the extension is stripped from the file name by default).
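
The position arithmetic described above can be reproduced with a few lines of plain Java. This is a sketch with hypothetical names (PositionalPropertySketch, partAt, stripExtension), not the mapper's implementation. It assumes that empty segments, such as the one in front of the leading separator of an absolute path, are ignored, which is what makes position 1 match path in the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class PositionalPropertySketch {

    // Removes the extension of a file name, e.g. "a_b.pdf" becomes "a_b".
    static String stripExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    // Positive positions count from the left (1 = first part),
    // negative positions count from the right (-1 = last part).
    static String partAt(String text, String separator, int position) {
        List<String> parts = new ArrayList<>();
        for (String part : text.split(Pattern.quote(separator))) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        if (position == 0 || position > parts.size() || position < -parts.size()) {
            throw new IllegalArgumentException("No part at position " + position);
        }
        int index = position > 0 ? position - 1 : parts.size() + position;
        return parts.get(index);
    }

    public static void main(String[] args) {
        System.out.println(partAt("/path/to/my/file.txt", "/", 1));  // path
        System.out.println(partAt("/path/to/my/file.txt", "/", -2)); // my
        System.out.println(partAt(stripExtension("invoice_20250982.pdf"), "_", 2)); // 20250982
    }
}
```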

Regular expression properties

Properties can be extracted from the path or file name using a regular expression. The expression can use capturing groups. The number of the group that contains the property can be configured. For example, the regular expression [-,_]([A-Z]{2})$ could be used to parse the language DE as two upper-case characters at the end of the file name EITCO_arveo-secom_Produktportfolio_DE.pdf. The extension is removed from the filename by default before the expression is matched against the file name.
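
The example expression can be checked with java.util.regex. The sketch below uses hypothetical names (RegexPropertySketch, extract) and assumes that the expression only needs to match a part of the file name (Matcher.find) rather than the whole name, as the example above suggests.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexPropertySketch {

    // Returns the content of the given group of the first match, if there is one.
    static Optional<String> extract(String text, String expression, int group) {
        Matcher matcher = Pattern.compile(expression).matcher(text);
        return matcher.find() ? Optional.ofNullable(matcher.group(group)) : Optional.empty();
    }

    public static void main(String[] args) {
        // The file name with the extension already removed.
        String name = "EITCO_arveo-secom_Produktportfolio_DE";
        String expression = "[-,_]([A-Z]{2})$";
        System.out.println(extract(name, expression, 1)); // Optional[DE]
        System.out.println(extract(name, expression, 0)); // Optional[_DE]
    }
}
```

Group 0 returns the entire match including the leading separator, so group 1 is the one to configure when only the language code is wanted.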

Property mapping

Properties to collect are configured using a map, where the keys are the names of the properties. These names can then be used in the attribute mappings. Each map entry can contain the following settings. Note that only one way to collect a property can be used for each individual property.

Table 2. property mapping options
Option	Explanation
position-in-path	Position in the path split by path separator char. Positive or negative integer.
position-in-file-name	Position in the file name split by file name separator char. Positive or negative integer.
path-regex	Regular expression matched on the path
file-name-regex	Regular expression matched on the file name
remove-extension	Whether to remove the extension from the path or file name. Default is true.

Each regular expression has two settings:

Table 3. regular expression settings
Option	Explanation
expression	The regular expression.
group	The number of the group containing the value (default is 0 for the entire expression).

Predefined properties

The mapper provides some predefined properties that can be used in attribute mappings without additional configuration in the property mappings.

Table 4. predefined properties
Property	Content
_path	Absolute path of the file (string)
_name	File name (string including extension)
_parent	Path of the file’s parent directory (string)
_length	Length of the file (long)
_extension	Extension of the file name only (string)
_last_modified	Last modified timestamp of the file (in ISO-8601 format)

Attribute mapping

The collected properties (and the predefined properties) are mapped to arveo attributes in the attribute mappings. The key of each attribute mapping must be a collected or predefined property name. Each attribute mapping can use the following parameters:

Table 5. attribute mapping parameters
Parameter	Explanation
attribute-name	The name of the arveo attribute (in snake-case)
type	The type of the attribute (`SHORT`, `INTEGER`, `LONG`, `DOUBLE`, `STRING`, `BOOLEAN`, `DATE_TIME`, `DATE` or `TIME`)
array	Whether the attribute is multivalued or not (the default is false).
delimiter	The delimiter of multivalued attributes. Ignored when `array` is set to false.
date-pattern	The pattern used to parse attributes of type `DATE`.
time-pattern	The pattern used to parse attributes of type `TIME`.
date-time-pattern	The pattern used to parse attributes of type `DATE_TIME`.
zone-id	The time zone ID used when the value of a `DATE_TIME` attribute does not contain time zone information.
local-date	The local date for a date-time attribute, used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
local-time	The local time for a date-time attribute, used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
prefix	An optional prefix added to imported attributes of type `STRING`.
suffix	An optional suffix added to imported attributes of type `STRING`.
default-value	The default value used when no value is available for a configured attribute mapping. Must follow the same format rules as regular values of the attribute.
number-format.pattern	Format pattern for numeric values.
number-format.grouping-separator	The grouping separator character.
number-format.decimal-separator	The decimal separator character.
unique-identifier	Marks the attribute as a unique identifier or as part of a combined unique identifier that will be used to check if the entity already exists.
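
The three number-format options correspond to what `java.text.DecimalFormat`, which the configuration reference points to, expects. The following Java sketch shows how a pattern and custom separators can be combined to parse a value. It is an illustration only: the class and method names are hypothetical, and rejecting trailing characters is an assumption about how strict the parsing should be.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper, not part of the import service.
class NumberFormatSketch {

    // Parses a value using a non-localized pattern and explicit separator characters.
    static BigDecimal parse(String value, String pattern, char groupingSeparator, char decimalSeparator) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator(groupingSeparator);
        symbols.setDecimalSeparator(decimalSeparator);

        DecimalFormat format = new DecimalFormat(pattern, symbols);
        format.setParseBigDecimal(true);

        ParsePosition position = new ParsePosition(0);
        Number number = format.parse(value, position);
        // Reject values that could not be parsed completely (assumption).
        if (number == null || position.getIndex() != value.length()) {
            throw new IllegalArgumentException("Not a valid number: " + value);
        }
        return new BigDecimal(number.toString());
    }

    public static void main(String[] args) {
        // Documented defaults: pattern #,##0.###, grouping ',' and decimal '.'
        System.out.println(parse("1,234.56", "#,##0.###", ',', '.'));
        // German-style input with swapped separators
        System.out.println(parse("1.234,56", "#,##0.###", '.', ','));
    }
}
```

The pattern itself stays non-localized: `,` and `.` in the pattern are placeholders, and the configured separator characters decide which characters are accepted in the input.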

The partial default values local-date, local-time and zone-id are not applied to empty values. To configure a default for empty values, use default-value instead.
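
The interplay of partial values and partial defaults can be sketched with `java.time`. The sketch uses the default date-time-pattern listed in the configuration properties, in which every part is optional. Everything else is an assumption made for illustration: the class and method names are hypothetical, and returning an empty result for empty values simply stands in for "the partial defaults are not applied".

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;
import java.util.Optional;

// Hypothetical helper, not part of the import service.
class PartialDateTimeSketch {

    // Default date-time pattern from the configuration reference.
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("[u-M-d]['T'][H:m:s][X]");

    // Completes a partial date-time value with the configured partial defaults (any of them may be null).
    static Optional<ZonedDateTime> resolve(String value, LocalDate localDate, LocalTime localTime, ZoneId zoneId) {
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty(); // partial defaults are not applied to empty values
        }
        TemporalAccessor parsed = FORMATTER.parse(value);
        LocalDate parsedDate = parsed.query(TemporalQueries.localDate());
        LocalTime parsedTime = parsed.query(TemporalQueries.localTime());
        ZoneOffset parsedOffset = parsed.query(TemporalQueries.offset());

        // Each missing part falls back to its partial default.
        LocalDate date = parsedDate != null ? parsedDate : localDate;
        LocalTime time = parsedTime != null ? parsedTime : localTime;
        ZoneId zone = parsedOffset != null ? parsedOffset : zoneId;
        if (date == null || time == null || zone == null) {
            throw new IllegalArgumentException("Incomplete value without matching default: " + value);
        }
        return Optional.of(ZonedDateTime.of(date, time, zone));
    }

    public static void main(String[] args) {
        ZoneId utc = ZoneId.of("UTC");
        System.out.println(resolve("2011-12-03", null, LocalTime.parse("10:15"), utc));
        System.out.println(resolve("10:15:30", LocalDate.parse("2011-12-03"), null, utc));
        System.out.println(resolve("2011-12-03T10:15:30Z", null, null, null));
        System.out.println(resolve("", LocalDate.parse("2011-12-03"), LocalTime.parse("10:15"), utc));
    }
}
```

A value that contains only a date is completed with local-time and zone-id, a value that contains only a time is completed with local-date and zone-id, and a complete value ignores all three defaults.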

Example configuration

The generic file mapper supports individual configurations for each configured import route as shown in the following example:

configuring the generic mapper

universal-import-service:
  file:
    file-import-1: # (1)
      uri: "file://..."
  generic-file-mapper:
    settings:
      file-import-1: # (2)
        type-definition-name: "files"

1	The ID of the import route for the configured directory
2	The ID of the configuration for the generic file mapper. Must match the ID of the import route.

The following example shows how to configure the generic file mapper. The example collects three properties and maps these and one of the predefined properties to attributes.

example configuration

generic-file-mapper:
  settings:
    file-import-1: # (1)
      conflict-resolution-strategy: UPDATE_ATTRIBUTES
      skip-empty-documents: true
      type-definition-name: "files" # (2)
      properties: # (3)
        parent-name:
          position-in-path: -2
        drive-name:
          position-in-path: 1
        language:
          file-name-regex:
            expression: "[-,_]([A-Z]{2})$"
            group: 1
      attributes: # (4)
        _name:
          attribute-name: file_name
          type: STRING
          unique-identifier: true
        parent-name:
          attribute-name: parent_name
          type: STRING
        language:
          attribute-name: language
          type: STRING
        drive-name:
          attribute-name: drive_name
          type: STRING

1	Name of the configuration. Must match the name of the configured import route.
2	Name of the type definition to import into
3	Map of collected properties
4	Mapping from properties to attributes

Extensions

The generic file importer provides an extension mechanism that allows preprocessing of imported files, for example to perform decryption. To use this extension mechanism, a bean of type de.eitco.uis.mappers.common.FilePreprocessor must be registered with a custom Spring Boot Starter. FilePreprocessors can influence the way the actual content of the file is read, as well as properties like file length, name and path. This mechanism can also be used to provide attributes that will be added to the arveo entity created for the file. This makes it possible to implement an import in which the files read from a directory do not contain the actual content to import, but a link to another file containing the actual content, together with a collection of attributes to import.

A second extension mechanism makes it possible to resolve attributes for an imported file from arbitrary sources. For example, the extension could read a YAML file containing additional attributes. To use this extension mechanism, a bean of type de.eitco.uis.mappers.common.AttributesProvider must be registered. The attributes returned by an AttributesProvider are added after the regular attribute mapping has been performed, so an AttributesProvider can overwrite attributes that were already resolved.

It is possible to register several FilePreprocessor and AttributesProvider beans. The bean used for a specific import route is selected by calling the bean’s usedForRoute method with the ID of the route. The first bean found that returns true in this method will be used.
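
The selection and overwrite rules described above can be sketched as follows. The sketch does not use the real interfaces: `SketchAttributesProvider` is a hypothetical stand-in, only the method name usedForRoute is taken from the text, and the `provide` method, its signature and the attribute map type are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration, not part of the import service.
class ProviderSelectionSketch {

    // Stand-in for the real AttributesProvider; provide() is an assumed method.
    interface SketchAttributesProvider {
        boolean usedForRoute(String routeId);
        Map<String, Object> provide();
    }

    // Creates a provider that is responsible for exactly one route.
    static SketchAttributesProvider fixed(String routeId, Map<String, Object> attributes) {
        return new SketchAttributesProvider() {
            @Override public boolean usedForRoute(String id) { return routeId.equals(id); }
            @Override public Map<String, Object> provide() { return attributes; }
        };
    }

    // The first provider that accepts the route wins; later matches are ignored.
    static Optional<SketchAttributesProvider> select(List<SketchAttributesProvider> providers, String routeId) {
        return providers.stream().filter(p -> p.usedForRoute(routeId)).findFirst();
    }

    // Provided attributes are added after the regular mapping and overwrite existing keys.
    static Map<String, Object> merge(Map<String, Object> mapped, Map<String, Object> provided) {
        Map<String, Object> result = new LinkedHashMap<>(mapped);
        result.putAll(provided);
        return result;
    }

    public static void main(String[] args) {
        List<SketchAttributesProvider> providers = List.of(
                fixed("file-import-1", Map.of("language", "EN")),
                fixed("file-import-1", Map.of("language", "FR")));

        Map<String, Object> mapped = new LinkedHashMap<>();
        mapped.put("language", "DE");
        mapped.put("file_name", "a.pdf");

        Map<String, Object> provided = select(providers, "file-import-1")
                .map(SketchAttributesProvider::provide)
                .orElse(Map.of());
        System.out.println(merge(mapped, provided));
    }
}
```

Because only the first matching bean is used, the usedForRoute implementations of several beans should not overlap unless the registration order is known.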

The required classes for the extensions are contained in the following artifact:

<dependency>
    <groupId>de.eitco.uis</groupId>
    <artifactId>universal-import-mappers-common</artifactId>
    <version>${import.service.version}</version>
</dependency>

Writing a custom file mapper

Custom file mappers have to implement the interface FileMapper. Mappers can use the typed or the generic API of arveo. A mapper that uses the generic API has to return true in the isGeneric method implementation and has to implement the mapFileGeneric method. Typed mappers have to return false in the isGeneric method and have to implement the mapFile method.

The custom mapper implementation has to be registered as a Spring bean in a custom Spring Boot starter. To replace the provided default mapper, the custom auto starter either has to run before the auto configuration class de.eitco.uis.mappers.file.GenericFileMapperAutoConfiguration or the default mapper bean registration has to be disabled by setting the property universal-import-service.generic-file-mapper.enabled to false.

File mappers return a list of batch operations. The arveo entities for one file are created by the respective batch operation(s). The operations are executed in the order in which they appear in the list.

The custom mapper can be activated by adding the jar of the custom starter to the service’s libs directory configured by the parameter -Dloader.path.

Configuration Properties

generic-csv-mapper.settings

Property	Type	Description	Default value
attribute-null-value	String	The string representing a null value of an attribute. An attribute with this value will be set to null when an entity is updated during an import.	`$null`
attributes	Map	Attribute mappings.
attributes.<key>.array	boolean	Whether the attribute is multivalued.	`false`
attributes.<key>.attribute-name	String	The name of the attribute in arveo.
attributes.<key>.date-pattern	String	The pattern used to parse date values.
attributes.<key>.date-time-pattern	String	The pattern used to parse date-time values.
attributes.<key>.default-value	String	Optional default value.
attributes.<key>.delimiter	String	The delimiter of multivalued attributes.
attributes.<key>.local-date	String	The local date for a date-time attribute, used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
attributes.<key>.local-time	String	The local time for a date-time attribute, used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
attributes.<key>.number-format.decimal-separator	char	The decimal separator used to parse numbers.	`.`
attributes.<key>.number-format.grouping-separator	char	The grouping separator used to parse numbers.	`,`
attributes.<key>.number-format.pattern	String	Non-localized pattern string for the number format. See java.text.DecimalFormat.	`#,##0.###`
attributes.<key>.prefix	String	Optional prefix for STRING attributes.
attributes.<key>.suffix	String	Optional suffix for STRING attributes.
attributes.<key>.time-pattern	String	The pattern used to parse time values.
attributes.<key>.type	AttributeType	The type of the attribute.
attributes.<key>.unique-identifier	boolean	If true, the value of this attribute will be used to check if an entity with the same value already exists. If so, the current entity will be skipped.	`false`
attributes.<key>.zone-id	String	The zone ID used for date-time attribute if the value does not contain zone information.
conflict-resolution-strategy	ConflictResolutionStrategy	Determines how conflicts, such as existing entities with the same identifier, are handled. SKIP: the entity is skipped without making changes; UPDATE: the entity is updated with the new data; UPDATE_ATTRIBUTES: only the attributes of the entity are updated, ignoring its content; OVERWRITE: the current version of the entity is entirely overwritten; OVERWRITE_ATTRIBUTES: only the attributes of the entity are overwritten, ignoring its content. The default strategy is SKIP.	`skip`
counting.content.content-element-name	String	The name of the content element to use.
date-pattern	String	Default pattern for date values. Can be overridden by attribute specific setting.	`u-M-d`
date-time-pattern	String	Default pattern for date-time values. Can be overridden by attribute specific setting.	`[u-M-d]['T'][H:m:s][X]`
entity-type	EntityType	The type of entity created by the mapper (DOCUMENT or CONTAINER).	`document`
local-date	String	The local date for a date-time attribute, used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
local-time	String	The local time for a date-time attribute, used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
mode	MapperMode	The mode used by the mapper.	`simple`
number-format.decimal-separator	Character	The decimal separator used to parse numbers.	`.`
number-format.grouping-separator	Character	The grouping separator used to parse numbers.	`,`
number-format.pattern	String	Non-localized pattern string for the number format. See java.text.DecimalFormat.	`#,##0.###`
referencing.container-type-definition-name	String	The name of the type definition that will contain the record containers.
referencing.content.content-element-name	String	The name of the content element to use.
referencing.document-attributes	Map	Attribute mappings for the document entities. Can be empty.
referencing.document-attributes.<key>.array	boolean	Whether the attribute is multivalued.	`false`
referencing.document-attributes.<key>.attribute-name	String	The name of the attribute in arveo.
referencing.document-attributes.<key>.date-pattern	String	The pattern used to parse date values.
referencing.document-attributes.<key>.date-time-pattern	String	The pattern used to parse date-time values.
referencing.document-attributes.<key>.default-value	String	Optional default value.
referencing.document-attributes.<key>.delimiter	String	The delimiter of multivalued attributes.
referencing.document-attributes.<key>.local-date	String	The local date for a date-time attribute, used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
referencing.document-attributes.<key>.local-time	String	The local time for a date-time attribute, used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
referencing.document-attributes.<key>.number-format.decimal-separator	char	The decimal separator used to parse numbers.	`.`
referencing.document-attributes.<key>.number-format.grouping-separator	char	The grouping separator used to parse numbers.	`,`
referencing.document-attributes.<key>.number-format.pattern	String	Non-localized pattern string for the number format. See java.text.DecimalFormat.	`#,##0.###`
referencing.document-attributes.<key>.prefix	String	Optional prefix for STRING attributes.
referencing.document-attributes.<key>.suffix	String	Optional suffix for STRING attributes.
referencing.document-attributes.<key>.time-pattern	String	The pattern used to parse time values.
referencing.document-attributes.<key>.type	AttributeType	The type of the attribute.
referencing.document-attributes.<key>.unique-identifier	boolean	If true, the value of this attribute will be used to check if an entity with the same value already exists. If so, the current entity will be skipped.	`false`
referencing.document-attributes.<key>.zone-id	String	The zone ID used for date-time attribute if the value does not contain zone information.
referencing.reference-field-name	String	The name of the attribute in the document type definition containing the foreign key to the record containers. If not set, no reference will be created automatically.
referencing.reference-target-field-name	String	The name of the attribute in the container type definition referenced by the foreign key. If not set, the ID system field will be used.
simple.content.file-name-mappings	Map	Mappings from file names to content element names.
simple.content.position-mappings	Map	Mappings from the position of the file name in the list (starting at 0) to content element names.
skip-empty-documents	boolean	If true, documents with no binary content (no files or only empty files) will be skipped. If false, a document containing only metadata will be created.	`false`
time-pattern	String	Default pattern for time values. Can be overridden by attribute specific setting.	`H:m:s`
type-definition-name	String	The name of the type definition the entities are imported into.
zone-id	String	Default zone ID used when date-time pattern does not contain zone information. Can be overridden by attribute specific setting.

generic-file-mapper.settings

Property	Type	Description	Default value
attributes	Map	Attribute mappings.
attributes.<key>.array	boolean	Whether the attribute is multivalued.	`false`
attributes.<key>.attribute-name	String	The name of the attribute in arveo.
attributes.<key>.date-pattern	String	The pattern used to parse date values.
attributes.<key>.date-time-pattern	String	The pattern used to parse date-time values.
attributes.<key>.default-value	String	Optional default value.
attributes.<key>.delimiter	String	The delimiter of multivalued attributes.
attributes.<key>.local-date	String	The local date for a date-time attribute, used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
attributes.<key>.local-time	String	The local time for a date-time attribute, used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
attributes.<key>.number-format.decimal-separator	char	The decimal separator used to parse numbers.	`.`
attributes.<key>.number-format.grouping-separator	char	The grouping separator used to parse numbers.	`,`
attributes.<key>.number-format.pattern	String	Non-localized pattern string for the number format. See java.text.DecimalFormat.	`#,##0.###`
attributes.<key>.prefix	String	Optional prefix for STRING attributes.
attributes.<key>.suffix	String	Optional suffix for STRING attributes.
attributes.<key>.time-pattern	String	The pattern used to parse time values.
attributes.<key>.type	AttributeType	The type of the attribute.
attributes.<key>.unique-identifier	boolean	If true, the value of this attribute will be used to check if an entity with the same value already exists. If so, the current entity will be skipped.	`false`
attributes.<key>.zone-id	String	The zone ID used for date-time attribute if the value does not contain zone information.
conflict-resolution-strategy	ConflictResolutionStrategy	Determines how conflicts, such as existing entities with the same identifier, are handled. SKIP: the entity is skipped without making changes; UPDATE: the entity is updated with the new data; UPDATE_ATTRIBUTES: only the attributes of the entity are updated, ignoring its content; OVERWRITE: the current version of the entity is entirely overwritten; OVERWRITE_ATTRIBUTES: only the attributes of the entity are overwritten, ignoring its content. The default strategy is SKIP.	`skip`
content-element-name	String	The name of the content element in the type definition that will contain the file’s data.
date-pattern	String	Default pattern for date values. Can be overridden by attribute specific setting.	`u-M-d`
date-time-pattern	String	Default pattern for date-time values. Can be overridden by attribute specific setting.	`[u-M-d]['T'][H:m:s][X]`
entity-type	EntityType	The type of entity created by the mapper (DOCUMENT or CONTAINER).	`document`
file-name-property-separator	String	An optional separator used to split the filename in parts which can then be used to extract properties. Note that this is a regular expression!
local-date	String	The local date for a date-time attribute, used if the value does not contain a date. Must be in ISO-8601 format such as '2011-12-03'.
local-time	String	The local time for a date-time attribute, used if the value does not contain a time. Must be in ISO-8601 format such as '10:15' or '10:15:30'.
properties	Map	Mappings for properties that can be extracted from the path and file name. Properties are then used as sources for the attribute mappings.
skip-empty-documents	boolean	If true, documents with no binary content (no files or only empty files) will be skipped. If false, a document containing only metadata will be created.	`false`
time-pattern	String	Default pattern for time values. Can be overridden by attribute specific setting.	`H:m:s`
type-definition-name	String	The name of the type definition the entities are imported into.
zone-id	String	Default zone ID used when date-time pattern does not contain zone information. Can be overridden by attribute specific setting.
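
Because file-name-property-separator is interpreted as a regular expression, characters with a special meaning in regular expressions have to be escaped. The following Java sketch shows the effect with `String.split`. It is an illustration only: the class name is hypothetical, and it assumes that the service splits the file name in a comparable way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the import service.
class SeparatorRegexSketch {

    // Splits a file name (without extension) by a separator regular expression.
    static List<String> split(String name, String separatorRegex) {
        return Arrays.asList(name.split(separatorRegex));
    }

    public static void main(String[] args) {
        // An unescaped dot matches every character, so no parts remain.
        System.out.println(split("invoice.2025.final", "."));                 // []
        // An escaped or quoted dot splits at the literal character.
        System.out.println(split("invoice.2025.final", "\\."));               // [invoice, 2025, final]
        System.out.println(split("invoice.2025.final", Pattern.quote(".")));  // [invoice, 2025, final]
        // A character class allows several separator characters at once.
        System.out.println(split("invoice-2025_final", "[-_]"));              // [invoice, 2025, final]
    }
}
```

The same applies to other special characters such as `|`, `+` or `(`, while a plain separator like `_` needs no escaping.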

universal-import-service

Property	Type	Description	Default value
csf	Map	Settings for the CSF imports.
csf.thread-pool-size	Integer	The thread pool size used to process the CSF files.	`1`
csf.uri	String	The URI used to import CSF files.
csv	Map	Settings for the CSV imports.
csv.thread-pool-size	Integer	The thread pool size used to process the CSV files.	`1`
csv.uri	String	The URI used to import CSV files.
delete-imported-files	Boolean	Whether to delete files that were successfully imported. This refers to files for the actual content elements.	`false`
delete-skipped-files	Boolean	Whether to delete files that were skipped. This refers to files for the actual content elements.	`false`
file	Map	Settings for the file importer.
file.thread-pool-size	Integer	The thread pool size used to process the files.	`1`
file.uri	String	The URI used to import files.
generic-csv-mapper.settings	Map	Settings for the generic CSV mapper for each import.
generic-file-mapper.settings	Map	Settings for the generic file mapper for each import.
type-name-mappings	Map	Mappings from source type names to type definition names. Used for DDC names found in CSF headers.
