The following section shows how to load extracted SAP data to Hadoop.

Connection #

Adding a Destination #

  1. In the main window of the Designer, navigate to Server > Manage Destinations. The window “Manage Destinations” opens.
  2. Click [Add] to create a new destination. The window “Destination Details” opens.
  3. Enter a Name for the destination.
  4. Select the destination Type from the drop-down menu.

Destination Details #


Hadoop Settings #

HDFS Web API

Web Hdfs URL
Enter the URL of the REST API. The URL contains the prefix /webhdfs/v1/ and has the following format: http://[host]:[port]/webhdfs/v1/[path]
For more information on HDFS URLs, see WebHDFS REST API: FileSystem URIs vs HTTP URLs.

User name
Enter a user name with write access to the Hadoop destination. If no user name is provided, the default user dr.who is used.

Connect
After entering the web URL and the user, click [Connect] to check if a connection can be established.
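
As a rough illustration of the URL format described above, the following Python sketch assembles a WebHDFS request URL and sends a status request, which is one way to check by hand that the endpoint is reachable. The host, port, path, user and helper names are placeholders chosen for this example. `op=GETFILESTATUS` and `user.name` are standard WebHDFS query parameters, but this is not necessarily what the [Connect] button does internally.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_webhdfs_url(host, port, path="", op="GETFILESTATUS", user=None):
    """Return http://[host]:[port]/webhdfs/v1/[path]?op=...&user.name=..."""
    params = {"op": op}
    if user:  # only send user.name when a user was entered
        params["user.name"] = user
    clean_path = quote(path.strip("/"))
    return f"http://{host}:{port}/webhdfs/v1/{clean_path}?{urlencode(params)}"


def check_connection(url, timeout=10):
    """Send the request and return the parsed JSON answer.

    Raises urllib.error.URLError or HTTPError if the endpoint cannot be reached.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical host, port, path and user, for illustration only.
url = build_webhdfs_url("namenode.example.com", 9870, "data/sap", user="xtract")
print(url)
```

Calling `check_connection(url)` against a reachable cluster returns the file status of the given path as a dictionary; it is left out of the example run because it needs a live endpoint.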

File Format #

File type
Select the required file format. You can choose between CSV, Parquet and JSON.


CSV Settings

To write data in CSV format, no further settings are required.

Parquet Settings

To write data in Parquet format, no further settings are required.

Json Settings

To write data in JSON format, no further settings are required.

Settings #

Opening the Destination Settings #

  1. Create or select an existing extraction (see also Getting Started with Xtract Universal).
  2. Click [Destination]. The window “Destination Settings” opens.

The following settings can be defined for the destination:

Destination Settings #


File Name #

File Name determines the name of the target file. The following options are available:

  • Same as name of SAP object: uses the name of the SAP object.
  • Same as name of extraction: uses the name of the extraction.
  • Custom: lets you define your own name.
  • Append timestamp: adds the timestamp in UTC format (_YYYY_MM_DD_hh_mm_ss_fff) to the file name of the extraction.

Note: If the name of an object does not begin with a letter, it will be prefixed with an ‘x’, e.g. an object by the name _namespace_tabname.csv will be renamed x_namespace_tabname.csv when uploaded to the destination. This is to ensure that all uploaded objects are compatible with Azure Data Factory, Hadoop and Spark, which require object names to begin with a letter or give special meaning to objects whose names start with certain non-alphabetic characters.
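
The naming rules above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and placing the timestamp before the file extension as well as applying the ‘x’ prefix as the last step are assumptions, not documented behavior.

```python
from datetime import datetime, timezone


def timestamp_suffix(moment):
    """Format a datetime as _YYYY_MM_DD_hh_mm_ss_fff (fff = milliseconds)."""
    millis = moment.microsecond // 1000
    return moment.strftime("_%Y_%m_%d_%H_%M_%S_") + f"{millis:03d}"


def safe_object_name(name):
    """Prefix names that do not begin with a letter with 'x'."""
    if name and name[0].isalpha():
        return name
    return "x" + name


def build_file_name(base, extension, append_timestamp=False, moment=None):
    """Combine base name, optional UTC timestamp and file extension."""
    if append_timestamp:
        moment = moment or datetime.now(timezone.utc)
        base += timestamp_suffix(moment)
    return safe_object_name(f"{base}.{extension}")


print(safe_object_name("_namespace_tabname.csv"))
print(build_file_name("KNA1", "csv", append_timestamp=True))
```

The first call reproduces the renaming from the note above; the second prints a name that carries the current UTC time.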

Using Script Expressions as Dynamic File Names

Script expressions can be used to generate a dynamic file name. This allows generating file names that are composed of an extraction’s properties, e.g. extraction name, SAP source object. This scenario supports script expressions based on .NET and the following XU-specific custom script expressions:

Input | Description
#{Source.Name}# | Name of the extraction’s SAP source.
#{Extraction.ExtractionName}# | Name of the extraction.
#{Extraction.Type}# | Extraction type (Table, ODP, DeltaQ, etc.).
#{Extraction.SapObjectName}# | Name of the SAP object the extraction is extracting data from.
#{Extraction.Timestamp}# | Timestamp of the extraction.
#{Extraction.SapObjectName.TrimStart("/".ToCharArray())}# | Trims the leading slash ‘/’ of an SAP object, e.g. /BIO/TMATERIAL to BIO/TMATERIAL, so as not to create an empty folder in a file path.
#{Extraction.SapObjectName.Replace('/', '_')}# | Replaces all slashes ‘/’ of an SAP object, e.g. /BIO/TMATERIAL to _BIO_TMATERIAL, so as not to split the SAP object name by folders in a file path.
#{Extraction.Context}# | Only for ODP extractions: returns the context of the ODP object (SAPI, ABAP_CDS, etc.).
#{Extraction.Fields["[NameSelectionFields]"].Selections[0].Value}# | Only for ODP extractions: returns the input value of a defined selection / filter.
#{TableExtraction.WhereClause}# | Only for Table extractions: returns the WHERE clause of the extraction.

For more information on script expressions, see Script Expressions.
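
To make the effect of the two string expressions from the table concrete, the following Python snippet applies the equivalent string operations to the example object name. The script expressions themselves are evaluated as .NET; Python is used here only to show the resulting values, and the variable names are chosen for this example.

```python
sap_object_name = "/BIO/TMATERIAL"  # example value from the table above

# Counterpart of .NET TrimStart("/".ToCharArray()): drop leading slashes.
trimmed = sap_object_name.lstrip("/")

# Counterpart of .NET Replace('/', '_'): replace every slash with an underscore.
flattened = sap_object_name.replace("/", "_")

print(trimmed)    # BIO/TMATERIAL
print(flattened)  # _BIO_TMATERIAL
```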

Date conversion #

Convert date strings
Converts the character-type SAP date (YYYYMMDD, e.g., 19900101) to a formatted date (YYYY-MM-DD, e.g., 1990-01-01). The target data then stores dates in a real date data type instead of a string data type.

Convert invalid dates to
If an SAP date cannot be converted to a valid date format, the invalid date is converted to the entered value. NULL is supported as a value.

When converting the SAP date, the two special cases 00000000 and 9999XXXX are checked first.

Convert 00000000 to
Converts the SAP date 00000000 to the entered value.

Convert 9999XXXX to
Converts the SAP date 9999XXXX to the entered value.
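
The order of checks described in this section can be summarized in a short Python sketch. The function and parameter names are hypothetical, and the sketch returns ISO strings for readability, whereas the destination stores a real date type. `None` stands in for NULL.

```python
from datetime import datetime


def convert_sap_date(value, invalid_to=None, zero_to=None, year9999_to=None):
    """Convert an SAP date string (YYYYMMDD) to YYYY-MM-DD.

    The three replacement values correspond to the settings
    'Convert invalid dates to', 'Convert 00000000 to' and 'Convert 9999XXXX to'.
    """
    # The two special cases are checked before any regular conversion.
    if value == "00000000":
        return zero_to
    if len(value) == 8 and value.startswith("9999"):
        return year9999_to
    # Anything that is not exactly eight digits cannot be a valid SAP date.
    if len(value) != 8 or not value.isdigit():
        return invalid_to
    try:
        return datetime.strptime(value, "%Y%m%d").date().isoformat()
    except ValueError:  # e.g. month 13 or February 30
        return invalid_to


print(convert_sap_date("19900101"))  # 1990-01-01
```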

Existing files #

Replace file: The export process will overwrite existing files.
Append results: The export process will append new data to an already existing file.
Abort extraction: The process will be aborted if the file already exists.

Note: The append operation only works for CSV files.
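
The three options can be read as a small decision rule. The Python sketch below maps each option to the WebHDFS operation that could implement it (`op=CREATE` via PUT, `op=APPEND` via POST). The mode names, the function name and the choice to create the file when appending to a file that does not exist yet are assumptions made for this example, not documented Xtract Universal behavior.

```python
def plan_write(mode, file_name, exists):
    """Return (http_method, query) for the write, or raise to abort.

    mode is one of "replace", "append" or "abort" (names chosen for this sketch);
    exists tells whether the target file is already present.
    """
    is_csv = file_name.lower().endswith(".csv")
    if mode == "replace":
        # Overwrite whatever is there.
        return ("PUT", "op=CREATE&overwrite=true")
    if mode == "append":
        if not is_csv:
            raise ValueError("append is only supported for CSV files")
        if exists:
            return ("POST", "op=APPEND")
        return ("PUT", "op=CREATE&overwrite=false")  # assumption: create if missing
    if mode == "abort":
        if exists:
            raise FileExistsError(f"{file_name} already exists")
        return ("PUT", "op=CREATE&overwrite=false")
    raise ValueError(f"unknown mode: {mode}")


print(plan_write("append", "KNA1.csv", exists=True))
```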

Hadoop Remote Folder #

Enter the name of the folder to write the data to.
Subfolders are also supported and can be entered as follows: Folder/Subfolder1/Subfolder2/. If the entered folder does not exist, a new folder is created.
If no folder is entered, the data will be written into the root folder.
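
As a small illustration of how a folder entry and a file name combine, the following Python sketch joins both into a relative target path. The function name and the file name are hypothetical, and tolerating stray leading or trailing slashes is an assumption of this sketch.

```python
def target_path(folder, file_name):
    """Join the remote folder and the file name into a relative path."""
    # Drop empty segments so that leading, trailing or doubled slashes do no harm.
    parts = [part for part in folder.split("/") if part]
    return "/".join(parts + [file_name])


print(target_path("Folder/Subfolder1/Subfolder2/", "KNA1.csv"))
print(target_path("", "KNA1.csv"))  # no folder entered: file lands in the root folder
```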

Using Script Expressions as Dynamic Folder Paths

Script expressions can be used to generate a dynamic folder path. This allows generating folder paths that are composed of an extraction’s properties, e.g. extraction name, SAP source object. This scenario supports script expressions based on .NET and the following XU-specific custom script expressions:

Input | Description
#{Source.Name}# | Name of the extraction’s SAP source.
#{Extraction.ExtractionName}# | Name of the extraction.
#{Extraction.Type}# | Extraction type (Table, ODP, DeltaQ, etc.).
#{Extraction.SapObjectName}# | Name of the SAP object the extraction is extracting data from.
#{Extraction.Timestamp}# | Timestamp of the extraction.
#{Extraction.SapObjectName.TrimStart("/".ToCharArray())}# | Trims the leading slash ‘/’ of an SAP object, e.g. /BIO/TMATERIAL to BIO/TMATERIAL, so as not to create an empty folder in a file path.
#{Extraction.SapObjectName.Replace('/', '_')}# | Replaces all slashes ‘/’ of an SAP object, e.g. /BIO/TMATERIAL to _BIO_TMATERIAL, so as not to split the SAP object name by folders in a file path.
#{Extraction.Context}# | Only for ODP extractions: returns the context of the ODP object (SAPI, ABAP_CDS, etc.).
#{Extraction.Fields["[NameSelectionFields]"].Selections[0].Value}# | Only for ODP extractions: returns the input value of a defined selection / filter.
#{TableExtraction.WhereClause}# | Only for Table extractions: returns the WHERE clause of the extraction.

For more information on script expressions, see Script Expressions.
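
To show how a dynamic folder path is composed from extraction properties, the following Python sketch substitutes plain `#{...}#` placeholders with sample values. It is a toy stand-in, not the actual script engine: it handles simple property names only (no .NET method calls), and the property values are invented for the example.

```python
import re


def resolve(template, properties):
    """Replace #{Name}# placeholders with values from a dictionary."""
    def lookup(match):
        return str(properties[match.group(1)])

    # Matches simple dotted property names such as Extraction.Type.
    return re.sub(r"#\{([A-Za-z.]+)\}#", lookup, template)


# Hypothetical property values, for illustration only.
properties = {
    "Source.Name": "ECC",
    "Extraction.Type": "Table",
    "Extraction.ExtractionName": "KNA1_Full",
}

folder = resolve(
    "#{Source.Name}#/#{Extraction.Type}#/#{Extraction.ExtractionName}#/",
    properties,
)
print(folder)  # ECC/Table/KNA1_Full/
```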