Data migration is one of those topics that seems simple in concept but is incredibly complex in practice. The simple idea is to take data from a source legacy system & move it to a target system. How hard could that be? Well, when the data models are very different it can be pretty difficult. Simple database migrations where the source and target data models & values are exactly the same are easy. Data migrations between two different ERPs & data models are complex. Let’s look at the key steps of a S4 HANA data migration project:
- Identify Scope of Objects Required for S4 HANA Data Migration (For example: Customers, Vendors, Materials etc.)
- Create a Source Data Cleansing Plan & Approach
- Determine which ETL (extract, transform, load) Tool to Use and high level technical data base architecture
- Determine Number of Mock Conversions Required & scope/targets per mock conversion (mock conversions are an end to end simulation of the extract/transform/load process which is typically used to feed data for testing cycles)
- Create Source Data Definition Documents
- Create Target Data Definition Documents
- Identify Source Data Relevancy Rules (For example: exclude records that are inactive with no business transactions for X number of years)
- Create a Source to Target Functional Design & Mapping Document detailing how each field required in the target system will be populated
- Determine the S4 HANA Migration Data Load Mechanism for Each Object (LSMW, LTMC, IDOC, BAPI etc.)
- Determine the Technical Extract Mechanisms
- Create the Technical Migration Design & Mapping Document
- Determine the Data Validation & Reporting Approaches
- Run Mock Conversions
- Triage Defects & Incorporate Fixes into Future Mocks
- Create a Detailed Cutover Plan including the data conversion & non data conversion steps
As you can see, the simple concept of pulling data from a source to a target becomes quite involved, especially in the complex world of SAP.
Let’s break down each of these steps as it relates to data conversion.
Identify Scope of Data Objects Required for S4 HANA Data Migration
The first step is to identify the scope of objects required for S4 HANA data migration and categorize them into three key areas of data:
- Master Data – Data that changes somewhat infrequently depending on the object. Master data is used as a foundation for higher level data objects such as purchase orders. Examples of master data are customers, vendors & materials. Business transactions do not primarily focus on master data, but they include master data within them
- Conditional Data – Conditional data is usually a mix of one or more master data objects with additional fields, or it provides additional information for master data. Conditional data is an input to business transactional data, similar to master data. An example of conditional data is pricing, which often depends on conditions such as master data & transaction volume. Other examples are customer material information records or purchasing information records (additional data about a material and vendor)
- Transactional Data – This is the data that stores business transactions in the system. Examples are inventory, purchase orders, contracts, sales orders etc. Transactional data stores information about actual business transactions and relies on master data and conditional data as input

Let’s explore a bit into each of the object categories above:
Master Data
When it comes to key master data objects, there are several critical objects that form the foundation for the rest of the S4 HANA data conversion. Mistakes in these objects cascade to the rest of the system.
- Material Master – This consists of both the products sold to the customers and all the raw/in process materials used in production. Depending on the procurement process this can also include all the indirect materials such as office supplies & services.
- Customer Master – This is the end to end list of customers that a company sells products and services to
- Vendor Master – This is the list of vendors that a company uses to procure products and services
- Profit Center – This is a logical breakdown of a part of a company that generates profits, for example an individual retail store within a larger company
- Cost Center – Cost centers are logical breakdowns of a company that provide services which do not directly generate profit but are necessary for the company to function. Companies often consider IT service divisions as cost centers, for example, because they do not sell IT services outside the company, yet the company relies on them for its operations.
- G/L Accounts – These are general ledger accounts that are required on balance sheets for financial reporting
- Bank Master – This is SAP’s method of storing banks. Since multiple customers or vendors can use the same bank, it is created as a separate record that can be assigned to a customer/vendor
These master data objects listed above will be essential in converting the rest of the objects. For details on the data tables that make up key master data objects in S4 check out the link below:
https://techconsultinghub.com/2022/09/09/sap-s4-master-data-key-objects-tables/
Conditional Data
As discussed above, conditional data is usually a mixture of master data along with additional information. Let’s take a look at some of the key objects:
- Vendor Master Pricing Conditions – This is a systematic way to determine the pricing for purchase orders
- Customer Pricing Condition Records – This is a systematic way to determine the price for sales orders
- Purchasing Information Records (PIRs) – These records provide additional information related to a combination of a material & a vendor
- Customer Material Information Records (CMIRs) – These records provide additional information related to a customer & material combination
- Product Costing – These records capture the internal cost of a material and must be loaded onto the materials for inventory valuation
- Bill of Material (BOM) – This is a list of sub materials needed to create a material
As you can see above, converting conditional data typically requires master data as a prerequisite, since master data serves as an input when loading these records. Conditional data objects then provide more specific information that helps determine values on transactional records.
Transactional Data
Transactional data is the data that most closely resembles business processes within the system such as placing a purchase order with a vendor. Let’s take a look at some of the key object examples:
- Purchase Orders – This data represents the actual requests to vendors for materials
- Sales Orders – This data represents the confirmation of materials that will be sold to a customer
- Accounts Payables – This data represents open balances for vendors that need to be paid
- Accounts Receivables – This data represents open balances for customers that have payments remaining to be received
- Material Inventory – This data represents levels of inventory at different areas within the overall supply chain
Transactional data puts all the pieces together; it cannot be loaded until we have loaded the relevant prerequisite master and conditional data. As you can see there is a cascading effect of data where master data is the base that other objects build on. That is why it is so critical to convert accurate master data with high load pass percentages.
It is helpful to maintain a large master excel sheet or other spreadsheet tool that tracks the full list of data objects along with their category (master, conditional, transactional), data owner(s), dependencies, & estimated volumes of data required for conversion. This full data object list is the foundation of the data migration scope. Now that we have a background understanding of all the different data objects that could be in scope, let’s look at an example of what such a data object list could look like.
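As a rough illustration (not a project artifact), a minimal in-memory version of such an object inventory might look like the sketch below; the object names, owners, dependencies & volumes are made up:

```python
# Illustrative only: a minimal in-memory version of a data object inventory.
# Object names, owners, dependencies & volumes below are made-up examples.
from collections import Counter

data_objects = [
    {"object": "Bank Master", "category": "Master",
     "owner": "Finance", "depends_on": [], "est_volume": 2_000},
    {"object": "Business Partner - Customer", "category": "Master",
     "owner": "Order to Cash", "depends_on": ["Bank Master"], "est_volume": 45_000},
    {"object": "Material Master", "category": "Master",
     "owner": "Plan to Manufacture", "depends_on": [], "est_volume": 120_000},
    {"object": "Customer Pricing Conditions", "category": "Conditional",
     "owner": "Order to Cash",
     "depends_on": ["Business Partner - Customer", "Material Master"], "est_volume": 300_000},
    {"object": "Open Sales Orders", "category": "Transactional",
     "owner": "Order to Cash",
     "depends_on": ["Business Partner - Customer", "Material Master",
                    "Customer Pricing Conditions"], "est_volume": 8_000},
]

# Quick summary by category, similar to a pivot on the master spreadsheet.
print(Counter(obj["category"] for obj in data_objects))
```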

Create a Source Data Cleansing Plan & Approach
A fundamental aspect of data conversion is cleansing master data prior to extracting it for migration. Not having accurate master data in the source system will result in poor data in the target system. Duplicates not caught in the source will create duplicates in the target system. The ideal scenario when starting a new project is to have clean & accurate master data. It is a mistake to assume the transformation process from source to target will create clean data instead of cleansing as much as possible in the source system.
To profile the data it is essential to have business rules. These rules should look at the following items:
- Determining if there are duplicates in the system
- If data is missing in mandatory fields
- If incorrect data is populated in fields
- If addresses are not standardized
There should be a full list of business rules identified and a reporting mechanism in place, such as using SAP Information Steward to profile the data. This will provide a cleansing score, & there should be continual work on data cleansing, either through individual changes to records or by leveraging any existing mass-maintenance tools. If a rule can be expressed as a simple if-then statement, consider handling it as a transformation rule rather than cleansing the data in the source. A data quality reporting dashboard (example shown below) should be utilized in order to drive cleansing.

There should be an excel sheet that captures the following information: business rule ID, business object, business rule name, data quality category, business rule description, cleansing approach, & pseudo code for the business rule along with the fields impacted.
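To illustrate how such business rules translate into profiling checks, here is a minimal sketch; the record fields, reference values & scoring approach are assumptions for the example, not SAP Information Steward functionality:

```python
# Sketch of simple profiling checks; record fields, reference values & scoring are hypothetical.
records = [
    {"id": "C001", "name": "ACME Corp", "country": "US", "postal_code": "30301"},
    {"id": "C002", "name": "ACME Corp", "country": "US", "postal_code": ""},
    {"id": "C003", "name": "Beta GmbH", "country": "Germany", "postal_code": "80331"},
]

valid_countries = {"US", "DE", "FR"}  # assumed reference list of standardized country codes

def profile(recs):
    issues = []
    seen_names = set()
    for r in recs:
        if r["name"] in seen_names:
            issues.append((r["id"], "possible duplicate name"))
        seen_names.add(r["name"])
        if not r["postal_code"]:
            issues.append((r["id"], "mandatory field missing: postal_code"))
        if r["country"] not in valid_countries:
            issues.append((r["id"], "non-standard country value"))
    score = 1 - len({rec_id for rec_id, _ in issues}) / len(recs)  # share of clean records
    return issues, score

issues, score = profile(records)
print(f"cleansing score: {score:.0%}")
for rec_id, issue in issues:
    print(rec_id, issue)
```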

Determine which ETL (extract, transform, load) Tool to Use
If you’re utilizing SAP S4 HANA, the typical tool in the landscape will be SAP BusinessObjects Data Services (BODS). It is a robust data migration tool & is well suited to handling large, complex S4 HANA migrations and utilizing IDOCs as load mechanisms. SAP Rapid Data Migration (RDM), a product offered by SAP, includes a number of SAP-built load mechanisms. RDM also includes documentation to help jump start the data migration process for SAP S4 HANA.
For the majority of SAP customers, SAP BODS will be the primary tool. Other ETL tools, such as Informatica or Syniti (formerly Back Office Associates), can also be used.
At this point I would also like to call out that it is possible no ETL tool will be required at all. In the case of an SAP ECC to S4 HANA data migration where all the load volumes are relatively small, it is possible to use only a load tool such as SAP LTMC (Migration Cockpit) and do any file manipulation needed in excel. This is an unlikely case, as most SAP customers are fairly large and would require somewhat complex transformation rules.
Determine Number of Mock Data Conversions Required
A critical part of any data migration effort is to perform end to end tests that extract data from the source system, pass it through the transformation process & lastly load it into the target system. As part of this mock data migration process, data validation occurs throughout to ensure the data extracted, transformed & loaded matches what should be occurring in each step. Additionally, data that is converted in a mock data conversion will be utilized during overall system testing to ensure that the converted data works throughout the end to end business process.
One of the first steps of a S4 HANA data migration approach is to map out the number of mock data conversions that will occur throughout the project. Here are a few considerations that will help determine this number:
- How complex is the overall data migration?
- How many pre production systems are in the landscape such as Dev, QA, Pre-Prod etc.?
- What is the total volume of master data required for migration?
- Is it reasonable for the project to manually create test master data to perform a first round of testing?
For large ERP programs there are typically 3 to 4 mock conversion cycles prior to the actual cutover (the data transfer to the live production system). Each mock cycle will not have the same expectations; in the beginning the expectation is a lower pass rate & typically less than full data volumes. As the mocks get closer to go live, the expectation for the amount of master data to be pushed through data migration increases, along with the expected pass percentage.
Let’s take a look at a quick example of a 4 mock conversion cycle approach and what some of the data load volumes would be along with the pass percentage.
| | Mock 0 | Mock 1 | Mock 2 | Mock 3 |
|---|---|---|---|---|
| Master Data Volume | 25% | 80% | 100% | 100% |
| Master Data Pass Percentage | 50% | 75% | 90% | 95% |
| Transactional Data Volume | N/A | 75% | 100% | 100% |
| Transactional Data Pass Percentage | N/A | 60% | 80% | 90% |
As you can see, as you progress through the mock cycles there is an increasing amount of master data that should be put through the data migration process and an increasing percentage of data that should successfully load into the system. As a note, the master data volume percentages typically align with how much data has been cleansed, so the more cleansed data there is, the more volume you can put through each mock cycle to find issues earlier in the process.
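For clarity, the pass percentage is simply the share of attempted records that load successfully, compared against the target for that mock. A trivial sketch with invented numbers:

```python
# Hypothetical Mock 1 results for one object, checked against the 75% master data pass target.
attempted = 80_000            # records pushed through extract/transform/load
loaded_successfully = 63_500  # records that created cleanly in the target system
pass_rate = loaded_successfully / attempted
target = 0.75                 # Mock 1 master data pass target from the table above
print(f"pass rate {pass_rate:.1%} - {'meets' if pass_rate >= target else 'misses'} the {target:.0%} target")
```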
Create Source Data Definition Documents
Data definition documents are one of the key inputs required to complete an overall data migration project. Source data definition documents describe the source (legacy) system’s data, which is what will be used for conversion. A source data definition document should include the following information about the data in the existing system:
- The full list of data fields that are in use within the source system
- The type of those data fields such as free text or a drop down list etc.
- Whether the field is required or optional
- Any business rules that are related to that field
- The technical table & field names
- The length of the fields
- Whether they contain any sensitive or personally identifiable information (PII)
- The description of each of those fields & its usage
- Data Owners/Stewards for each field
This information helps inform the cleansing rules that are used and serves as an input to create the final data mapping document.
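To illustrate, a single field entry in a source data definition document might look like the sketch below; the specific field, table & owner shown are just an example:

```python
# One illustrative row of a source data definition document; values are examples only.
field_definition = {
    "field_name": "Customer Name",
    "technical_table": "KNA1",        # example table name
    "technical_field": "NAME1",       # example field name
    "data_type": "free text",
    "length": 35,
    "required": True,
    "contains_pii": True,
    "business_rules": ["must not be blank", "no duplicate name + city combinations"],
    "description": "Legal name of the customer used on documents",
    "data_owner": "Order to Cash",
}
```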
Create Target S4 HANA Data Definition Documents
Similar to needing a source data definition, a target data definition is needed to know exactly how the data will need to be transformed. The target data definition document should have at least the same minimum requirements as the source data definition document:
- The full list of data fields that will be in use within the target system
- The type of those data fields such as free text or a drop down list etc.
- Whether the field is required or optional
- Any business rules that are related to that field
- The technical table & field names
- The length of the fields
- Whether they contain any sensitive or personally identifiable information (PII)
- The description of each of those fields & its usage
- Data Owners/Stewards for each field

It should also include any reference tables that should be utilized in order to validate the data values against what is configured in the system. Typically on projects, finalizing the data definition documents for master data objects requires tight collaboration with different functional areas such as order to cash, finance, procurement etc. As an example, for the key master data objects and their organizational levels below, you can see which functional teams are needed to finalize target data definitions. The functional teams should often act as the data owners, depending on the data field.
| Data Object | Data Subview | Functional Team(s) |
|---|---|---|
| Business Partner-Customer | General Data | Order to Cash, Finance |
| Business Partner-Customer | Sales Data | Order to Cash |
| Business Partner-Customer | Company Code Data | Finance |
| Business Partner-Vendor | General Data | Procure to Pay, Finance |
| Business Partner-Vendor | Purchasing Data | Procure to Pay |
| Business Partner-Vendor | Company Code Data | Finance |
| Material Master | General Data | Order to Cash, Finance, Procure to Pay, Plan to Manufacture |
| Material Master | Sales Views | Order to Cash |
| Material Master | Manufacturing Views (MRP) | Plan to Manufacture |
| Material Master | Accounting/Costing | Finance |
| Material Master | Purchasing View | Procure to Pay |
Identify Source Data Relevancy Rules
Data relevancy rules define what data from the source system should be in scope for the data migration to S4 HANA. In a data migration process you typically do not simply convert all of the existing legacy data into the new system. Often there is old historical data that is no longer needed and should not be converted. The older and less relevant the converted data is, the more issues tend to arise, because that data is not well maintained or has outdated values. Typically relevancy rules will look at the following criteria:
- Data that has been created in the last X number of months
- Data that has any open business transactions, such as PO’s, SO’s, Contracts etc.
- Data that has had a closed transaction within the last X number of months
Below are example relevancy rules that should be discussed with the business (a pseudo-code sketch of how such rules might be applied follows the table). There are times where sales history or manufacturing history is needed, for example because the system uses old historical sales data to predict future sales; in that case it is important to include historical data. However, a customer that has not been used in 5+ years should generally not be converted to the new system.
| Data Object | Sample Relevancy Rule |
|---|---|
| Business Partner-Customer | Any customer created in the past 6 months; any customer with an open sales order, contract, or financial document (AR etc.); any customer with a closed sales order, contract, or financial document within the past 2 years; exclude any customers flagged for deletion unless there is a current open transaction |
| Business Partner-Vendor | Any vendor created in the past 6 months; any vendor with an open purchase order, contract, or financial document (AP etc.); any vendor with a closed purchase order, contract, or financial document within the past 2 years; exclude any vendors flagged for deletion unless there is a current open transaction |
| Material Master | Any material created in the past 6 months; any material with an open purchase order, contract, sales order, inventory, or financial document; any material with a closed purchase order, contract, or financial document within the past 2 years; exclude any materials flagged for deletion unless there is a current open transaction |
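As a rough sketch of how the customer relevancy rules above might be expressed in code (the field names, date windows & records are assumptions for illustration, not an extract program):

```python
# Sketch of applying the sample customer relevancy rules; fields and windows are assumptions.
from datetime import date, timedelta

TODAY = date(2024, 1, 1)               # fixed date so the example is reproducible
CREATED_WINDOW = timedelta(days=183)   # "created in the past 6 months"
CLOSED_WINDOW = timedelta(days=730)    # "closed transaction within the past 2 years"

def is_relevant(customer):
    # Exclude deletion-flagged customers unless they still have an open transaction.
    if customer["deletion_flag"] and not customer["open_transactions"]:
        return False
    if TODAY - customer["created_on"] <= CREATED_WINDOW:
        return True
    if customer["open_transactions"]:
        return True
    last_closed = customer["last_closed_transaction"]
    return last_closed is not None and TODAY - last_closed <= CLOSED_WINDOW

customers = [
    {"id": "C100", "created_on": date(2023, 11, 20), "open_transactions": 0,
     "last_closed_transaction": None, "deletion_flag": False},
    {"id": "C200", "created_on": date(2015, 3, 2), "open_transactions": 0,
     "last_closed_transaction": date(2018, 6, 30), "deletion_flag": False},
]
print([c["id"] for c in customers if is_relevant(c)])   # only C100 is in scope
```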
Create a Source to Target Functional Design & Mapping Document
The core documents that will drive the entire S4 HANA data migration process are the functional design for data migration & its associated functional mapping document. A functional design for data conversion is a document, typically in MS Word, that outlines the end to end process required to move the data from the source system to the target system. A functional design for data conversion should include the following sections:
- Description of the high level process including the relevant source & target system
- High level data flow diagram showing the source system, the ETL system & the target system
- Relevancy Rules for data migration
- Data extraction details from source system
- Detailed field mapping document (typically an embedded or linked document that is an excel sheet mapping source to target fields)
- Validation approach including the design of any extract/pre/post load report requirements
- Error handling requirements such as any advanced error analysis needed based on the load methodology
- Data load approach including how the data will be loaded into the target system along with any dependencies etc.
One of the most important aspects of the functional design for data conversion is the functional mapping document (FMD). This is usually a separate excel document that is either embedded or linked. The FMD merges the source data definition document & the target data definition document along with the transformation required. The document is structured as follows: on the left are all the relevant fields from the source data definition; in the middle of the excel are columns for the transformation rules required to get the data from the source to the target. Below are the common transformation rules:
- Pass through – This is the simplest transformation rule, in which whatever data is in the source is passed exactly as-is to the target field
- Default – This is when regardless of the field value in source (or if there is no source data field available) there is a single default value that should be included in the target system
- Cross Reference (X-REF) – This is one of the most common mapping requirements when you have drop down fields. Typically source drop-down values vary in format from target drop-down values. In this case you mark the transformation as an XREF, where a list of source values maps to different target values. For example, in the source the country may be stored as USA but in the target the field will be US. Both represent the United States, but the data is stored differently in each system.
- Logic – This is the most complex option within the transformation process. With this type of transformation, coding rules are needed, often expressed as if-then statements. For example: if the street field is populated, grab the last 4 characters if they are numbers; otherwise do not pass any data to this field.
- Data Construction – This scenario is where there is no value in the source that can be mapped & the target value cannot be derived by any logic but needs to be manually determined by the business. In this case, usually through an excel template, the business will provide the values for the data object manually.
The last part of the FMD will be the target data fields; these are columns to the right of the transformation rules that match the target data definitions. As a note, the excel is read from left to right: the source data field on the left is mapped to the target data field on the right based on the transformation rule in the middle. Each row, read across, shows which source field maps to which target field based on which transformation rule.
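To make the rule types concrete, here is a minimal sketch of how a few of them might be applied to a single record; the field names, XREF table & street logic are illustrative assumptions, not a real BODS mapping:

```python
# Illustrative transformation of one source record using the rule types above.
COUNTRY_XREF = {"USA": "US", "GERMANY": "DE"}   # assumed cross-reference table

def transform(source):
    target = {}
    # Pass through: copy the value exactly as-is
    target["NAME1"] = source["customer_name"]
    # Default: constant value regardless of the source
    target["LANGUAGE"] = "EN"
    # Cross reference: translate the source drop-down value to the target code
    target["COUNTRY"] = COUNTRY_XREF.get(source["country"].upper(), "")
    # Logic: if the street ends in 4 digits, treat them as the house number
    street = source.get("street", "")
    target["HOUSE_NUM"] = street[-4:] if street[-4:].isdigit() else ""
    return target

print(transform({"customer_name": "ACME Corp", "country": "usa", "street": "Main Street 1234"}))
```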

Determine the S4 HANA Data Migration Load Mechanism for Each Object
As part of the functional design there is a joint effort between the functional & technical teams to align on the best method of loading the data into the system. The key factors that affect which load mechanism to use are listed below:
- What is the volume of data required for load?
- What standard load mechanisms are available for the object?
- Will there be a need to potentially perform massive corrections of data that was loaded?
- Based on volume of data, is parallel processing required?
- Are there any detailed requirements relating to error messages that need to be customized?
Now that we know the questions to ask, let’s go over some of the common SAP load mechanisms for master data and their pros & cons.
| S4 HANA Load Mechanism | Pros | Cons |
|---|---|---|
| SAP Migration Cockpit (LTMC) | -Simple templates based on either excel or HANA staging tables that allow immediate upload -Auto data validation included based on backend configuration and will provide errors on a simulation report prior to loading -Ability to perform data mapping (XREFs) directly through the migration cockpit prior to loading for any drop down fields -SAPs latest method of loading & allows for a single excel or HANA database file that has different tabs for each organizational area of master data that can be loaded in one upload | -Error reporting can be difficult to read -For massive amounts of data it can have slow load times compared to other data loads with more efficient parallel processing -No ability to mass update, once data is loaded through migration cockpit and created there is no ability to reload with a correction. A different upload tool would need to be used for any errors |
| SAP IDOCs | -Robust method of data conversion that has been around for a very long time with SAP -Allows for error analysis & reprocessing of failed IDOCs -Can be very fast in terms of parallel processing & loading of data -BODS has direct integrations in order to map data right from a staging table to an IDOC -Allows for mass correction of data in case there are any errors in the load by reloading IDOCs | -Error analysis can sometimes be difficult based on error messages -Analyzing IDOCs themselves can be difficult because it is hard to massively download data that is included in IDOCs |
| SAP BAPI | -Ability for most customization & speed as this is the standard interface that is called to update master data -Ability for more enhanced data validation & error outputs -Flexibility to customize data conversion since it will be using a custom program | -Custom ABAP program required to call the BAPI -Longest time to build as it requires development |
| SAP BDC | -Utilize the standard input screens of SAP but through an automated program -Ability for customization of errors etc. as it can be programmed | -Requires a developer to really adapt -Not as fast as other data conversion options |
| SAP LSMW | -Robust tool that can utilize either BDC, IDOCs, Direct Input or BAPIs to load data -Functional consultants can generate LSMWs based on recordings or IDOCs -Ability to have parallel processing -Simple CSV file upload | -Requires a separate CSV file per table of an object -Not as fast as BODS to IDOC creation -Requires file inputs when converting from older systems -Not supported by SAP for S4 at this time, meaning some S4 functionality may not work for recordings etc. |
Now that we have an idea of the different load options, let’s go over what some of the typical load approaches are based on the key master data objects.
| Data Object | Load Mechanisms |
|---|---|
| Material Master | Typically IDOC or BAPI |
| BP Customer | Migration Cockpit, IDOCs, or BAPI (there are nuances here due to BP model and IDOCs so be careful with load approach) |
| BP Vendor | Migration Cockpit, IDOCs, or BAPI (there are nuances here due to BP model and IDOCs so be careful with load approach) |
Determine the Technical Data Extract Mechanisms
The next step in the process is to align with the source system on the proper extraction mechanism. There are a number of ways data can be extracted from source depending on the volume and the requirements.
- Extraction by performing file downloads such as CSV from the source system and moving to an ETL staging table
- Native extractions in the source system with file outputs
- Using an ETL tool to pull data directly from the source system by database connection
- Extract the data into a database from source and move to an ETL staging table
The technical extraction method needs to be chosen based on the volume of the data and in coordination with the source system team. Ideally, if the ETL tool can connect directly to a source database and fetch the data, that is the fastest approach.
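As a rough sketch of the direct database connection option (with sqlite3 standing in for the real source & staging databases; table and column names are assumptions, and a real project would use the ETL tool’s own connectors):

```python
# Sketch: pull relevant source records into an ETL staging table via a database connection.
# sqlite3 in-memory databases stand in for the legacy source and the staging area;
# table and column names are hypothetical.
import sqlite3

source = sqlite3.connect(":memory:")
staging = sqlite3.connect(":memory:")

# Stand-in for the legacy system's customer table.
source.execute("CREATE TABLE customers (id TEXT, name TEXT, country TEXT, last_activity TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", [
    ("C100", "ACME Corp", "USA", "2023-11-20"),
    ("C200", "Old Vendor", "USA", "2016-01-15"),
])

staging.execute("CREATE TABLE stg_customers (id TEXT, name TEXT, country TEXT)")

# Only extract records touched in the last two years (a relevancy rule pushed into the extract).
rows = source.execute(
    "SELECT id, name, country FROM customers WHERE last_activity >= '2022-01-01'"
)
staging.executemany("INSERT INTO stg_customers VALUES (?, ?, ?)", rows.fetchall())
staging.commit()
print(staging.execute("SELECT * FROM stg_customers").fetchall())   # only C100 lands in staging
```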
Create the Technical Data Conversion Design & Mapping Document
Based on the functional design document, the technical design document describes the technical way the data will be converted. The functional design specifies what should be converted and where; the technical design explains how it will be converted. It includes the technical tools used & load mechanisms, along with any additional system programs that need to be written.
The functional design mapping document will have a corresponding technical design mapping document that will include the technical code used to perform all the transformations.
Determine the Data Validation & Reporting Approaches
As part of the data migration process there are three main checkpoints that should occur along the way:
- Post Extraction Validation Report – This report should be utilized to validate the data that was expected to come from source has been extracted correctly.
- Pre Load Validation Report – This report should be utilized to confirm that the data has been transformed correctly and is in the correct ready to load format. The data can be checked against the source data to ensure fields were mapped properly.
- Post Load Validation Report – This report will confirm that the data has been loaded correctly and will be checked against the pre load validation report to make sure they are matching. Additionally, this report should include pass percentages to show how much data successfully loaded into the target system.
These reports can have different requirements based on the best method of validating the data and how sensitive the data is. Oftentimes there is a representative sample of data that is validated based on a percentage of the total volume of data if it is a large enough data set. For example, if there are a million records converted you could take a random sample of approximately 100K records across different types of data that has been converted in order to validate the load was successful.
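As an illustration of the count reconciliation and sampling idea described above (the counts and 10% sample size are invented for the sketch):

```python
# Sketch of reconciling extract vs. post-load counts and drawing a validation sample.
import random

extracted_ids = {f"MAT{i:06d}" for i in range(1_000)}   # stand-in for 1,000 extracted records
loaded_ids = set(list(extracted_ids)[:970])              # pretend 970 loaded successfully

pass_rate = len(loaded_ids) / len(extracted_ids)
missing = extracted_ids - loaded_ids                     # records to investigate during triage
print(f"post-load pass rate: {pass_rate:.1%}, records to triage: {len(missing)}")

# Random sample of loaded records for the business to spot-check field by field.
sample = random.sample(sorted(loaded_ids), k=max(1, len(loaded_ids) // 10))
print(f"validation sample size: {len(sample)}")
```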

Lastly, there are often specific validation scripts that are run where the business goes through a set of checks that makes them comfortable the data has been loaded correctly. For example, based on the type of data there might be some critical fields where they want to check all of the records in the validation file, while for other, less critical fields a few spot checks are enough to make sure the data load was accurate.
Run Mock S4 HANA Data Conversions
This is the most critical step of a S4 HANA data migration program; it is where the entire process, including validations, is tested end to end. As described earlier, there will be a number of different mock data conversions with different goals & load targets. They will first require a data conversion plan, which will eventually feed into the cutover plan. The data conversion plan will outline:
- What system activities such as configuration need to be in place prior to starting the data conversion
- Provide the dependencies across the data conversion objects so there is an order in which the data conversion needs to operate. For example, bank master needs to be loaded before vendor master to assign vendors to banks
- The steps within each load along with their durations such as extraction, validations, transformation & load
This data conversion plan should be in a project planning tool such as MS Project or Smartsheet where dependencies can be linked and timing can be adjusted based on dependencies running late etc. This will all eventually feed into the overall cutover plan. The mocks should start to provide more detail on how long activities will take & uncover defects that will need to be resolved by the next round of mock activity. They should also identify any data cleansing that needs to be performed in the source system prior to the next load.
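The load-ordering piece of the plan is essentially a dependency graph; as a small illustration (the object names & dependencies are examples, not a full object list), a load sequence can be derived like this:

```python
# Sketch: derive a load order from object dependencies (e.g. Bank Master before BP Vendor).
from graphlib import TopologicalSorter   # standard library, Python 3.9+

dependencies = {
    "Bank Master": set(),
    "BP Vendor": {"Bank Master"},
    "Material Master": set(),
    "Purchasing Info Records": {"BP Vendor", "Material Master"},
    "Open Purchase Orders": {"BP Vendor", "Material Master", "Purchasing Info Records"},
}

load_order = list(TopologicalSorter(dependencies).static_order())
print(load_order)
# e.g. ['Bank Master', 'Material Master', 'BP Vendor', 'Purchasing Info Records', 'Open Purchase Orders']
```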
Triage Defects & Incorporate Fixes into Future Mock Conversions
In this step it is important to analyze any of the issues that were found within the mocks, identify their root cause & determine how to resolve them. There are a number of issues that can arise that should be handled through different fixes. For example:
- Identifying cleansing issues – If there were only supposed to be a certain list of values in source and the transformation failed because a value was included that should not have been stored
- Identifying transformation issues – In the case where the actual transformation did not work as expected a defect needs to be resolved by the technical build team that created the program for transformation
- Identifying reporting issues – Many times when the business is reviewing a report they either need additional information or sometimes data is not provided correctly in the report despite it being correct in the system. In this case the technical team will need to review and update the report
- Identifying extraction or load issues – In the case where either the extraction program is incorrectly pulling data or the load program is missing or incorrectly populating data these programs will need to be evaluated by the technical team
- Design Changes – Sometimes the expectations for how the data will be transformed don’t match reality. It isn’t until a full mock data conversion that you realize a few additional cases in the data needed to be considered; in this case you would need to update, for example, the functional mapping document to make sure the data migration meets the business needs
After all the defects are identified for a mock, there needs to be careful planning to make sure all the changes can be accommodated by the next mock or, if they are less critical, pushed out to a future mock. S4 HANA data migration is often most successful when it is performed as an iterative process that constantly improves with each mock.
Create a Detailed S4 HANA Cutover Plan
As you run through each mock cycle, a plan should take shape that captures all the detailed dependencies between the different data migration objects. Timing around the process steps such as extract, transform, load & validation should be refined, and any additional dependencies around configuration & transports should be incorporated. The last step is creating a detailed cutover plan that includes not only everything from a data migration perspective but also anything that needs to be in the production system to start operations. For example, if there are enhancements that require manual steps, such as updating a variable table that isn’t transportable, those steps would need to be included.
The goal here is to work across every functional team to get every detailed step down to the minute in order to successfully go live. Beyond the functional teams, we also need to meet with the business to align on the strategies to close out business activities in the legacy system prior to starting cutover. It’s common to stop creating orders the week of cutover and wait to create them once the new system is live. Additionally, there can be a lot of ramp downs in other areas as you try to have a quiet period during a production go live. Everything will be marked with dependencies on an overall cutover plan that should have come from the dress rehearsal, previous mocks & the system activities needed to start systems integration testing.
Usually there are also some smoke tests laid out in the cutover plan to quickly test functionality after a go live.
Congratulations! Now you have all the tools you need to plan master data migration all the way from the initial mocks through to cutover. I will be going over more specific details of the S4 HANA data conversion process in future blog posts, so consider subscribing to get the latest content. Please provide any questions down below!

