
Setting up

Adding Composer Dependencies

"require": {
    (...)
    "edmondscommerce/magento2-data-migration": "dev-master"
}


"require-dev": {
    (...)
    "magento/data-migration-tool": "2.2.1"
}

"repositories": {
    "data-migration-tool": {
        "type": "vcs",
        "url": "https://github.com/edmondscommerce/data-migration-tool"
    },
    "ec-data-migration": {
        "type": "git",
        "url": "ec@gitBare:/home/ec/repos/edmondscommerce/edmondscommerce-dataMigration"
    }
}

Then run composer update

Forking Magento's Migration Tool

Sometimes we need to amend Magento's tool. For this purpose we maintain our own fork. For example, this commit shows some of our changes:

user@desktop git --no-pager show 6fbe7cc6763c674e54ff6d805f895562bd8e6a13
commit 6fbe7cc6763c674e54ff6d805f895562bd8e6a13
Author: clientname-magento2 container <container@clientname-magento2>
Date:   Wed Oct 12 09:57:51 2016 +0000

    altering the group by logic as it was not working properly due to the store ID being set to 1 if 0. Needs the same logic in the group by. Also grouping by identifier to be more explicit

diff --git a/src/Migration/Step/UrlRewrite/Version191to2000.php b/src/Migration/Step/UrlRewrite/Version191to2000.php
index d553d3f..a588252 100644
--- a/src/Migration/Step/UrlRewrite/Version191to2000.php
+++ b/src/Migration/Step/UrlRewrite/Version191to2000.php
@@ -80,7 +80,7 @@ class Version191to2000 extends \Migration\Step\DatabaseStage implements Rollback
     protected $structure = [
         MapInterface::TYPE_SOURCE => [
             'core_url_rewrite' => [
-                'url_rewrite_id' ,
+                'url_rewrite_id',
                 'store_id',
                 'id_path',
                 'request_path',
@@ -126,7 +126,8 @@ class Version191to2000 extends \Migration\Step\DatabaseStage implements Rollback
         RecordFactory $factory,
         \Migration\Logger\Logger $logger,
         $stage
-    ) {
+    )
+    {
         parent::__construct($config);
         $this->source = $source;
         $this->destination = $destination;
@@ -150,7 +151,7 @@ class Version191to2000 extends \Migration\Step\DatabaseStage implements Rollback
             $this->structure[MapInterface::TYPE_SOURCE][self::SOURCE],
             array_keys($this->source->getStructure(self::SOURCE)->getFields())
         );
-        $destinationFieldsDiff= array_diff(
+        $destinationFieldsDiff = array_diff(
             $this->structure[MapInterface::TYPE_DEST][self::DESTINATION],
             array_keys($this->destination->getStructure(self::DESTINATION)->getFields())
         );
@@ -310,7 +311,7 @@ class Version191to2000 extends \Migration\Step\DatabaseStage implements Rollback

         $metadata = $this->doRecordSerialization($record)
             ? serialize(['category_id' => $record->getValue('category_id')])
-            : null ;
+            : null;
         $destRecord->setValue('metadata', $metadata);

         $destRecord->setValue('entity_id', $record->getValue('product_id') ?: $record->getValue('category_id'));
@@ -367,8 +368,7 @@ class Version191to2000 extends \Migration\Step\DatabaseStage implements Rollback
         )->where(
             'cp.identifier NOT IN(?)',
             $this->getUrlRewriteRequestPathsSelect()
-        )->group(['request_path', 'cps.store_id']);
-
+        )->group(['cp.identifier', new \Zend_Db_Expr('IF(cps.store_id = 0, 1, cps.store_id)')]);
         return $select;
     }

Keeping our Fork up to date

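To keep our fork up to date, pull the changes from Magento's original repository and push them to our fork. The commands below assume a git remote named magento that points at the original repository.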

cd /opt/Projects/{client-name}/vendor/magento/data-migration-tool
git pull magento master # Magento repository
git push origin master # Our fork repository

Migration Processes

Overview

Edmonds Commerce's magento2-data-migration tool is a wrapper around Magento's data migration tool.

It's intended to be added as a composer dependency to an existing Magento 2 site, so make sure you have an M2 site set up first.

The tool operates in three phases:

  1. preRun: Downloads a reference copy of the Magento 1 site (code, database and media) to be migrated
  2. protoType: Allows for test running of the migration. This is intended to be run multiple times
  3. pushToLive:

PreRun

Overview

  • Establishes what version of Magento the live site is using
  • Locally installs a reference clean copy of that version
  • Gets a dump of the live site database and imports it locally
  • Downloads a copy of the live site's media and copies it to the Magento 2 install
  • Applies fixes to the locally imported database that block the migration

PreRun key storage areas

  • The Live Magento 1 database will be stored at "[clientname]_magento1"
  • The Magento 1 reference files will be stored at $vhostRoot/bin/dataMigration/mage1Files/m1ref
  • The Magento 1 reference database will be stored in "[clientname]_magento1_ref"

Setup Steps

  1. Add [mageroot]/bin/dataMigration/mage1Files/mage1Version.txt, a text file containing the magento 1 version.
  2. Copy the local.xml to [mageroot]/bin/dataMigration/mage1Files/local.xml and change the database credentials to the current container.

PreRun procedure

  1. Run this with bash preRun/run.bash [remoteVhostPublicPath] [sshUser] [sshHost]
    1. _01_getM1ReferenceDbAndGenWhiteList.bash
      1. Checks that magerun is installed
      2. Drops, then recreates an empty M1 reference database, granting all privileges to the user
      3. Deletes any existing M1 reference codebase
      4. Uses magerun to create a reference clean M1 at /home/ec/m1ref, then moves it to the install path
      5. Creates a tableWhiteList.txt based on the reference database, omitting log and session tables
    2. _02_downloadDatabase.bash
      1. Makes a local copy of the live site's app/etc/local.xml and parses out its database connection credentials
      2. SSHs into the live server and runs mysqldump on the live database structure (no data), echoing its output to a local file
      3. Appends the local dump with data only from the tableWhiteList.txt list
    3. _03_importDatabase.bash
      1. Drops the local/beast's copy of the live database and recreates a new empty one
      2. Imports the live database dump into the local/beast database
    4. _04_fixKnownIssues.bash
      1. Looks for, and runs a similar bin/dataMigration/preHooks/_04_fixKnownIssues.bash for client-specific fixes
      2. Removes attributes that exist but aren't mentioned in the eav_attribute table
      3. Removes attribute values where the attribute has been deleted
      4. Removes attributes for entities that no longer exist
      5. Looks for multiple customers with the same email address, and adds a suffix to duplicate entries
      6. Removes tags where the associated customer no longer exists
    5. _05_downloadMedia.bash
      1. Creates M2 media directories for products, categories and wysiwyg
      2. Deletes and recreates a list of images in catalog_product_image.txt that are used by products in the database
      3. Uses rsync to copy the live site's product, category and then wysiwyg images, placing them into the local M2 media folder
      4. Ensures the media images are all owned by the ec user

Prototype

  • Sets up Magento 2 with a clean database
  • Sets up Magento's migration tool by creating working copies of the appropriate distributed XML files
  • Runs multiple test migrations, capturing any errors that occur on each iteration
  • Handles the migration errors by configuring the tool to ignore any problem data

Overview

Config XML files:

  • config.xml is the hub, containing database details, steps to run, and includes the other files
  • map.xml contains information for what to do with non-EAV documents (tables) and fields (columns)
  • map-eav.xml contains information for what to do with EAV data
  • eav-attribute-groups.xml
  • class-map.xml contains information for how to map Magento 1 PHP classes to Magento 2 equivalents
  • move.xml is an Edmonds Commerce file that can optionally be used to keep manually-added map.xml content separate

Prototype key storage areas

  • Config files are stored in [vhost root]/bin/dataMigration
  • Reinstalling Magento 2 is done by trying in the following order:
    1. [vhostRoot]/justInstalledClean.sql.gz which should represent a freshly installed Magento 2
    2. [vhostRoot]/bin/installScript.bash which should contain the magento:install command and anything else required to set up a new Magento 2

Prototype procedure

  1. Run this with bash prototype/run.bash go
    1. _00_checkPreRun.bash
      1. Checks for a predownloaded copy of the live local.xml
      2. Checks that the live database named "[clientname]_magento1" exists
      3. Checks that Magento 2 media folder exists
      4. Checks that the jiraShell container asset is installed
    2. _010_dropAndRebuildDatabase.bash
      1. includes parseLogAndUpdateMapXml.php, which:
      2. ├ Prompts the user for permission to proceed
      3. ├ Requests the password for the beast mysql root user
      4. ├ Drops the Magento 2 database
      5. ├ Creates a new empty Magento 2 database and grants access to the mysql user
      6. ├ Disables non-Magento modules
      7. ├ Installs Magento 2, either by importing [vhostRoot]/justInstalledClean.sql.gz or by running [vhostRoot]/bin/installScript.bash
      8. ├ Reenables all modules except for predefined problematic ones
      9. └ Runs bin/magento setup:upgrade
    3. _020_configureMigrationTool.bash
      1. Backs up, then empties the crontab
      2. Installs the Magento migration tool, if not already installed
      3. Checks for a predefined Magento 1 version number, or prompts the user for one
      4. Copies Magento's distributed config.xml for the specific M1 version to the data migration folder
      5. Updates the copied config.xml with database connection details
      6. Copies Magento's distributed map.xml, map-eav.xml, eav-attribute-groups.xml and class-map.xml for the specific M1 version to the data migration folder
      7. Creates a template move.xml in the data migration folder
    4. _030_runFirstMigration.bash
      1. Runs bin/magento's migrate:settings command, logging its output
      2. Runs bin/magento's migrate:data command, logging its output
      3. includes parseLogAndUpdateMapXml.php, which:
      4. ├ Parses the log output from the above, grouping them into the tool's steps
      5. ├ Iterates over all the steps, looking for unmapped documents and fields
      6. ├ Adds unmapped documents and fields to the ignore list in map.xml and map-eav.xml
      7. └ Queues tickets using the jiraShell container asset for ignored documents and fields
    5. _040_runSecondMigration.bash
      1. Runs bin/magento's migrate:data command, logging its output
      2. includes parseLogAndUpdateMapXml.php, which:
      3. ├ Parses the log output from the above, grouping them into the tool's steps
      4. ├ Iterates over all the steps, looking for unmapped documents and fields
      5. ├ Adds unmapped documents and fields to the ignore list in map.xml and map-eav.xml
      6. └ Queues tickets using the jiraShell container asset for ignored documents and fields
      7. includes parseLogAndUpdateClassMapXml.php, which:
      8. ├ Parses the log output from the above, looking for unmapped classes
      9. ├ Adds unmapped classes to the class-map.xml with empty <to> elements
      10. └ Queues tickets using the jiraShell container asset for ignored classes
      11. includes parseMoveXmlAndUpdateMapXml.php, which:
      12. ├ Parses the log output from the above, looking for unmapped classes
    6. _050_dropAndRebuildDatabase.bash
      1. includes parseLogAndUpdateMapXml.php, which:
      2. ├ Prompts the user for permission to proceed
      3. ├ Requests the password for the beast mysql root user
      4. ├ Drops the Magento 2 database
      5. ├ Creates a new empty Magento 2 database and grants access to the mysql user
      6. ├ Disables non-Magento modules
      7. ├ Installs Magento 2, either by importing [vhostRoot]/justInstalledClean.sql.gz or by running [vhostRoot]/bin/installScript.bash
      8. ├ Reenables all modules except for predefined problematic ones
      9. └ Runs bin/magento setup:upgrade
    7. _060_runFinalMigration.bash
      1. Creates a [vhost root]/var/dataMigration folder
      2. Runs bin/magento's migrate:settings command, logging its output
      3. Runs bin/magento's migrate:data command, logging its output
    8. _070_postImportTasks.bash
      1. Sets the frontend input of all unrecognised types to "text"
      2. Truncates the design_change table
      3. Runs a hardcoded postProcess bash file
    9. _080_cleanUpTasks.bash
      1. Runs bin/magento's setup:di:compile command
      2. Runs bin/magento's cache:flush command
      3. Runs bin/magento's indexer:reindex command
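
The log-parsing steps in the procedure above (done in the tool by parseLogAndUpdateMapXml.php) can be illustrated with a small shell sketch. It assumes the migration log reports unmapped tables on lines of the form "Source documents are not mapped: a,b"; check this against your own log output, as the wording may differ between tool versions. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "find unmapped documents, then ignore them" idea.
# Assumption: the log contains lines such as
#   [ERROR]: Source documents are not mapped: aw_blog,aw_blog_cat

# Prints each unmapped source document once, one per line.
unmapped_source_documents() {
    local log_file="$1"
    grep -F 'Source documents are not mapped:' "$log_file" \
        | sed 's/.*Source documents are not mapped: *//' \
        | tr ',' '\n' \
        | sed 's/^ *//; s/ *$//' \
        | grep -v '^$' \
        | LC_ALL=C sort -u
}

# Reads document names on stdin and prints an <ignore> rule for each,
# in the shape used inside <document_rules> of map.xml.
ignore_rules_for() {
    local document
    while IFS= read -r document; do
        printf '<ignore>\n    <document>%s</document>\n</ignore>\n' "$document"
    done
}

# Example usage (the log path is a placeholder):
#   unmapped_source_documents /path/to/migration.log | ignore_rules_for
```

The generated rules still need to be placed inside the document_rules node of map.xml; this sketch only prints them.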

Adding custom data migrations

map.xml: for non-EAV tables and columns

map-eav.xml: for EAV data

eav-attribute-groups.xml

class-map.xml: for mapping PHP classes

move.xml: for optional separate config

This is a file used by the Edmonds Commerce migration tool as an addition to the files used by Magento's migration tool. It's intended to separate out project-specific rules. The content of move.xml will be copied to the Magento config files by prototype/_060_parseMoveXmlAndUpdateMapXml.php.

An example of the file is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<!--
/**
 * Copyright © 2013-2017 Magento, Inc. All rights reserved.
 * See COPYING.txt for license details.
 */
-->
<map xmlns:xs="http://www.w3.org/2001/XMLSchema-instance" xs:noNamespaceSchemaLocation="../../map.xsd">
    <source>
        <field_rules>
            <move>
                <field>sales_flat_quote_address.sms</field>
                <to>quote_address.sms</to>
            </move>
            <move>
                <field>sales_flat_order_address.sms</field>
                <to>sales_order_address.sms</to>
            </move>
            <move>
                <field>sales_flat_order.pick_printed</field>
                <to>sales_order.pick_note_printed</to>
            </move>
        </field_rules>
        <document_rules>
            <rename>
                <document>m_misspell_suggest</document>
                <to>mst_misspell_suggest</to>
            </rename>
        </document_rules>
        <attributes>
            <ignore>
                <attribute type="catalog_product">apo_min_price</attribute>
            </ignore>
        </attributes>
    </source>
</map>

The nodes do the following:

  • field_rules: These are used to map a column from one table in the old database to a different one in the new database
  • document_rules: These are used to rename a table
  • attributes: These are used to ignore old attributes and make sure they are not migrated across

finalRun

Overview

To avoid having to generate the files again and again, it is possible to run the dataMigrationToolAndProcess/finalRun.bash file. This will drop and reinstall Magento, run the migration and then clean up the installation.

Step by step migration

Container creation

cd /opt/Projects/snippets-edmondscommerce/Cluster/shellscripts/cluster/setupPublicFacingContainers/setupStagingContainer/
Edit createGenericContainer.bash and change the default php version to 72 (this will be reverted after)
Run bash ./createMagento2Container.bash [clientname] [pubKey] [privKey] {optional container suffix - defaults to staging}
Make sure you put in the password for the magento user when asked for, NOT THE USERNAME.

Install the migration tool

Add the data migration tool and jira shell to composer.json

"require-dev": {
        "edmondscommerce/magento2-data-migration": "dev-Optimise",
        "edmondscommerce/jirashell": "dev-master",
        "magento/data-migration-tool": "2.3.2"
    },

"repositories": [
        {
            "type": "git",
            "url": "ec@gitBare:/home/ec/repos/edmondscommerce/edmondscommerce-dataMigration"
        }
    ],

Then run composer update

Get the M1 database

Create a new database client_name_magento1 and get the latest M1 database from live.

Add files to the bin folder

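Create the dataMigration folder (run this from the Magento 2 root, so the folder is created under the site's own bin directory rather than the system /bin):

mkdir -p bin/dataMigration/mage1Files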
Add the mage1Version.txt file to the mage1Files folder and add the m1 version to it (Example: 1.9.3.9)
Then copy the local.xml file into the same folder and change the credentials for the client_name_magento1 database.

Running pre run scripts

Go to the edmondscommerce migration tool directory

cd vendor/edmondscommerce/magento2-data-migration/dataMigrationToolAndProcess/preRun
Then run the command that fixes the known issues in the M1 database.
sudo bash
bash _05_fixKnownIssues.bash client_name_magento1 false
Then run the command that gets the media from M1 live.
sudo bash _06_downloadMedia.bash jimnybit 77.104.181.12 /home/jimnybit/public_html/media /var/www/vhosts/www.magento2.jimny.developmagento.co.uk/pub/media jimny_magento1 18765 false

Running the migration

Make sure the correct database details are present in bin/installScript.bash.
Then run the actual migration

/var/www/vhosts/www.magento2.jimny.developmagento.co.uk/vendor/edmondscommerce/magento2-data-migration/dataMigrationToolAndProcess/prototype/run.bash go false

Creating the jira tickets

  1. Create the Migration project in jira
  2. Read the readme.md file in the jirashell repo (vendor/edmondscommerce/jirashell/README.md)

Explanations

Prototype migration

The EC tool runs the migration process several times and, after each run, adds the errors it finds to the map.xml file as ignore rules.

Fixing the migration errors

  • Check that the tables are roughly equivalent (they should have mostly the same fields, if not exactly the same)
  • Get the table that was ignored in the prototype phase and re-map it to the m2 table like so:
    <rename>
        <document>aw_pquestion2_question</document>
        <to>aw_pq_question</to>
    </rename>
    
  • Drop the m2 database and re-run the migration process
  • Check the new errors and see if any fields need to be remapped or ignored
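
When several tables need re-mapping, the rename rule shown in the list above can be generated rather than typed by hand. This is a minimal sketch; rename_rule is a hypothetical helper, and its output still has to be pasted into the document_rules node of map.xml or move.xml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a <rename> rule that maps a Magento 1 table
# (document) to its Magento 2 equivalent.
# Arguments: <m1 table name> <m2 table name>
rename_rule() {
    printf '<rename>\n    <document>%s</document>\n    <to>%s</to>\n</rename>\n' "$1" "$2"
}

# Example usage:
#   rename_rule aw_pquestion2_question aw_pq_question
```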
