Filesystem Importer

Introduction

The Filesystem Importer saves objects from migration-center to the file system. It can also write the metadata for those objects either into separate XML files or into a single unified XML file. The folder structure (if any) can also be created in the filesystem during import. The target can be either a local filesystem or a share accessible via a UNC path.

Importer Configuration

To create a new Filesystem Importer job, select “Filesystem” from the list of available adapter types in the Importer Properties window. Once the adapter type has been selected, the Parameters list is populated with the parameters specific to that adapter type.

The Properties window of an importer can be accessed by double-clicking an importer in the list, or by selecting the Properties button/menu item from the toolbar/context menu.

A detailed description of the selected parameter is always displayed at the bottom of the window.

Importer parameters

The common adapter parameters are described in Common Parameters.

The configuration parameters available for the Filesystem importer are described below:

  • xsltPath The path to the XSL file used to transform the metadata (leave empty for the default metadata XML output)

  • unifiedMetadataPath The path and filename where the unified metadata file should be saved. The parent folder must exist, otherwise the import will stop with an error. Leave empty to create individual XML metadata files for each object.

  • unifiedMetadataRootNodes The list of XML root nodes to be inserted in the unified metadata file, which will contain the document and folder metadata nodes. The default value is “root”, which will create a … element. Multiple values are also supported, separated by “;”, e.g. “root;metadata”, which would create a … structure in the XML file for storing the object’s metadata.

  • moveFiles Flag for moving content files. Unchecked: the content files are only copied. Checked: the content files are moved (copied and then deleted from the original location). Default: false

  • loggingLevel*

    See Common Parameters.

Parameters marked with an asterisk (*) are mandatory.
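
The difference between the two moveFiles settings can be pictured with a short Python sketch. It only illustrates the copy versus copy-then-delete semantics described above and is not the importer's actual code; the function name transfer_content and its arguments are hypothetical, and creating missing target folders is an assumption of the sketch.

```python
import shutil
from pathlib import Path


def transfer_content(source: Path, target: Path, move_files: bool = False) -> Path:
    """Copy source to target; when move_files is true, delete the source afterwards."""
    # Assumption of this sketch: missing target folders are created on the fly.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy the content plus basic file metadata
    if move_files:
        source.unlink()  # "moved" means copied, then deleted from the original location
    return target
```

With move_files left at its default of False the source file stays in place, which mirrors the documented default of the parameter.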

Migset System Rules

Documents targeted at the filesystem must first be added to a migration set. This migration set must be configured to accept objects of type <source object type>ToFilesystem(document).

Create a new migration set and select the <source object type>ToFilesystem(document) object type in the Type drop-down. This is set in the Migration Set Properties window, which appears when creating a new migration set. The type of object can no longer be changed after a migration set has been created.

  • content_target_file_path Sets the full path, including filename and extension, where the current document should be saved on import. Use the available transformation methods to build a string representing a valid file system path. If not set, the source content will be ignored by the importer. Example: d:\Migration\Files\My Documents\Report for 12-11.xls

  • rendition_source_file_paths Sets the full path, including filename and extension, where a “rendition” file for the current document is located. Use the available transformation methods to build a string representing a valid file system path. This is a multi-value rule, allowing multiple paths to be specified if more than one rendition exists (PDF, plain text, and XML, for example). Example: \\server\share\Source Data\Renditions\Report for 12-11.pdf

  • rendition_target_file_paths Sets the full path, including filename and extension, where a “rendition” file for the current document should be saved. Typically this would be somewhere near the main document, but any valid path is acceptable. Use the available transformation methods to build a string representing a valid file system path. This is a multi-value rule, allowing multiple paths to be specified if more than one rendition exists (PDF, plain text, and XML, for example). Example: d:\Migration\Files\My Documents\Renditions\Report for 12-11.pdf

  • metadata_file_path The path to the individual metadata file that will be generated for the current object.

  • created_date Sets the “Created” date attribute in the filesystem.

  • modified_date Sets the “Modified” date attribute in the filesystem.

  • file_owner Sets the “Owner” attribute in the filesystem. The user, e.g. “domain\user” or “jdoe”, must exist either among the computer’s local users or in the LDAP directory.
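
In migration-center the path values for these rules are built with the transformation methods of the client. The Python sketch below only illustrates the kind of strings the rules are expected to produce; the helper names build_target_path and rendition_target_paths are hypothetical, and the "Renditions" subfolder is simply taken from the example paths above.

```python
from pathlib import PureWindowsPath


def build_target_path(base_folder: str, source_folder: str, file_name: str) -> str:
    """Join a target base folder, a relative source folder and a file name."""
    # Strip leading/trailing separators so the source folder is treated as relative.
    relative = source_folder.replace("/", "\\").strip("\\")
    return str(PureWindowsPath(base_folder) / relative / file_name)


def rendition_target_paths(content_target: str, extensions: list[str]) -> list[str]:
    """Derive one rendition path per extension, next to the main document."""
    main = PureWindowsPath(content_target)
    folder = main.parent / "Renditions"  # assumed layout, as in the examples above
    return [str(folder / (main.stem + ext)) for ext in extensions]


target = build_target_path(r"d:\Migration\Files", "/My Documents", "Report for 12-11.xls")
renditions = rendition_target_paths(target, [".pdf", ".txt"])
```

Because rendition_target_file_paths is a multi-value rule, the second helper returns a list with one entry per rendition.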

Object Type Definitions

Since the filesystem doesn’t use different object types for files, the Filesystem Importer doesn’t need this information either. However, migration-center’s workflow requires an association with at least one object type in order to validate and prepare the objects in a migration set for import.

To work around this, any existing migration-center object type definition can be used with the Filesystem Importer. A good practice would be to create a new object type definition containing the attribute values used with the Filesystem Importer, and to use this object type definition for association and validation.

Basic Migration Tasks

Metadata in XML files

In addition to the actual content files, metadata files containing the objects’ attributes can be created when outputting files from migration-center. These files use a simple XML schema and should usually be placed next to the objects they relate to. It is also possible to collect the metadata for all objects imported in a given run in a single metadata file, rather than in separate files.

Starting with version 3.2.6, the way the objects’ metadata is created has become more flexible. The following options are available:

  1. Generate the metadata for each object in an individual XML file. The name and location of the individual metadata file are configurable through the system rule “metadata_file_path”. If it is left empty, no individual metadata files will be generated.

  2. Generate the metadata of the imported objects in a single XML file. The name and location of the unified metadata file are set in the importer parameter “unifiedMetadataPath”. In this case the system rule “metadata_file_path” must be empty.

  3. Generate the metadata for each object in an individual XML file and also create the unified metadata file. The individual metadata file is set through the system rule “metadata_file_path” and the unified metadata file through the importer parameter “unifiedMetadataPath”.

  4. Import only the content of the files without generating any metadata file. In this case the system rule “metadata_file_path” and the importer parameter “unifiedMetadataPath” should both be left empty.
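
The four options amount to a small decision table over two settings. The following Python sketch restates that table; the function name metadata_outputs and the returned labels are hypothetical, and treating whitespace-only values as empty is an assumption.

```python
def metadata_outputs(metadata_file_path: str, unified_metadata_path: str) -> str:
    """Name the metadata output mode implied by the two settings."""
    # Assumption: a value consisting only of whitespace counts as empty.
    individual = bool(metadata_file_path.strip())
    unified = bool(unified_metadata_path.strip())
    if individual and unified:
        return "individual and unified"  # option 3
    if individual:
        return "individual only"  # option 1
    if unified:
        return "unified only"  # option 2
    return "content only"  # option 4
```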

If the imported files and metadata are meant to be scanned later with the Filesystem Scanner, the individual metadata file names should comply with the scanner’s naming convention. The individual metadata file must be located in the folder where the content is exported, and its name should consist of the name and extension of the content file followed by the extension of the metadata file.

For example, if a content file is exported to “d:\export\file1.pdf”, the generated individual metadata file should be “d:\export\file1.pdf.xml”, where “.xml” is the extension chosen for the metadata file.
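
The naming convention is simple enough to express in a few lines. This Python sketch uses a hypothetical helper, scanner_metadata_path, that appends the chosen metadata extension to the full content file path; in migration-center itself the value would be built in the “metadata_file_path” rule.

```python
def scanner_metadata_path(content_path: str, metadata_extension: str = ".xml") -> str:
    """Append the metadata extension to the full content file path."""
    # Accept the extension with or without its leading dot.
    if not metadata_extension.startswith("."):
        metadata_extension = "." + metadata_extension
    return content_path + metadata_extension
```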

Metadata file contents

A sample metadata file’s XML structure is illustrated below. The sample content could belong to a metadata file such as report.pdf.fme, generated for a content file named report.pdf. In this case the report.pdf file has four attributes, each defined as a name-value pair. There are five lines because one of the attributes is a multi-value attribute. Multi-value attributes are represented by repeating the attribute element with the same name but a different value (here the keywords attribute is listed twice, with different values).

<?xml version="1.0" encoding="UTF-8"?>
<contentattributes>
    <attribute name="keywords" value="Benchmark" />
    <attribute name="keywords" value="Technical" />
    <attribute name="reference_period" value="26.11.2001" />
    <attribute name="reference_period_from" value="26.11.2001" />
    <attribute name="reference_period_to" value="01.01.2100" />
</contentattributes>
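
Software that processes such files further can read them with any XML parser. As a minimal sketch, the following Python code parses the sample with the standard library and groups repeated attribute elements into lists; the function name read_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<contentattributes>
    <attribute name="keywords" value="Benchmark" />
    <attribute name="keywords" value="Technical" />
    <attribute name="reference_period" value="26.11.2001" />
    <attribute name="reference_period_from" value="26.11.2001" />
    <attribute name="reference_period_to" value="01.01.2100" />
</contentattributes>"""


def read_attributes(xml_bytes: bytes) -> dict[str, list[str]]:
    """Collect attribute values by name; multi-value attributes yield several entries."""
    values: dict[str, list[str]] = defaultdict(list)
    for node in ET.fromstring(xml_bytes).findall("attribute"):
        values[node.get("name")].append(node.get("value"))
    return dict(values)


attributes = read_attributes(SAMPLE)
```

Here attributes holds four names, and the entry for keywords contains both of its values in document order.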

To generate metadata files in a format different from the one above, an XSL template can be used to transform the above XML into another output format. To use this functionality, a corresponding XSL file needs to be built and its location specified in the importer’s xsltPath parameter. This way it is possible to obtain XML files in a format that can be processed further by other software if needed. The specified XSL template applies to both kinds of metadata file: individual and unified.

For a unified metadata file it is also possible to specify the name of the root node (through the importer parameter “unifiedMetadataRootNodes”) that will enclose the individual objects’ <contentattributes> nodes.
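
To make the shape of a unified file concrete, the Python sketch below assembles one root element that encloses a <contentattributes> node per object. It is an illustration only: the function name build_unified is hypothetical, only a single root node name is handled, and the exact output of the importer may differ.

```python
import xml.etree.ElementTree as ET


def build_unified(objects: list[dict[str, list[str]]], root_name: str = "root") -> str:
    """Wrap one <contentattributes> node per object in a single root element."""
    root = ET.Element(root_name)
    for attributes in objects:
        node = ET.SubElement(root, "contentattributes")
        for name, values in attributes.items():
            for value in values:  # repeat the element for multi-value attributes
                ET.SubElement(node, "attribute", {"name": name, "value": value})
    return ET.tostring(root, encoding="unicode")


unified_xml = build_unified([
    {"keywords": ["Benchmark", "Technical"]},
    {"reference_period": ["26.11.2001"]},
])
```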

Created, Modified and FileOwner

Filesystem attributes such as created, modified and owner are not only written to the metadata file but can also be set on the content file created in the operating system. Any source attribute can be mapped to one of these attributes through the migset system rules created_date, modified_date and file_owner.
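
For readers who want to check the result of an import, the Python sketch below shows how a modification date can be applied to a file and read back with the standard library. It is not the importer's implementation; the function name set_modified is hypothetical, and creation date and owner are left out because setting them is platform-specific.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def set_modified(path: Path, modified: datetime) -> None:
    """Set the access and modification time of a file to the given moment."""
    timestamp = modified.timestamp()
    os.utime(path, (timestamp, timestamp))  # (access time, modification time)


def get_modified(path: Path) -> datetime:
    """Read the modification time back as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
```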

Renditions

Even though the filesystem does not explicitly support “renditions”, i.e. representations of the same file in different formats, the Filesystem Importer can work with multiple files which represent different formats of the same content.

The Filesystem Importer does not and cannot generate these files. “Renditions” would typically come from an external source, such as PDF representations of editable Office file formats or technical drawings created using one of the many PDF generation applications available, or renditions extracted by a migration-center scanner from a system which supports such a feature.

If files intended to be used as renditions exist, the Filesystem Importer can be configured to get these files from their current location and move them to the import location together with the migrated documents. The “renditions” can then be renamed, for example to match the name of the main document they relate to; any other transformation is also possible.

“Renditions” are an optional feature and can be managed through dedicated system rules during the migration. See Migset System Rules for more.

Versions

The source data imported with the Filesystem Importer can originate from various content management systems which typically also support multiple versions of the same object.

The Filesystem Importer does not support outputting versioned objects to the filesystem (multiple versions of the same document for example).

This is due to the filesystem’s design, which supports neither versioning an object nor creating multiple files with the same name in the same folder. If versions need to be imported using the Filesystem Importer, the version information should be appended to the filename attribute to generate unique filenames. This way there will be no conflicting names and the importer will be able to write all files correctly to the location specified by the user.
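
One possible naming scheme is sketched below in Python. The "_v" separator and the function name versioned_file_name are assumptions chosen for illustration; in migration-center the equivalent string would be built with transformation rules.

```python
from pathlib import PureWindowsPath


def versioned_file_name(file_name: str, version_label: str) -> str:
    """Insert the version label between the file name and its extension."""
    name = PureWindowsPath(file_name)
    return f"{name.stem}_v{version_label}{name.suffix}"


names = [versioned_file_name("Report for 12-11.xls", label) for label in ("1.0", "1.1")]
```

Each version of the same document then receives a distinct file name, so no version overwrites another in the target folder.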

Links

The source data imported with the Filesystem Importer can originate from various content management systems which can support multiple links for the same object, i.e. one and the same object being accessible in multiple locations.

The Filesystem Importer does not support creating multiple links for objects in the filesystem (the same folder linked to multiple different parent folders for example). If the object to be imported with the Filesystem Importer has had multiple links originally, only the first link will be preserved and used by the Filesystem Importer for creating the respective object. This may put some objects in unexpected locations, depending on how the objects were linked or arranged originally.

Using scanner configuration parameters and/or transformation rules it should be possible to filter out any unneeded links, leaving only the required links to be used by the Filesystem Importer.