/
NUTRINTG CDP to SFMC Integration

NUTRINTG CDP to SFMC Integration

Epsilon CDP spec: MJN_CDP outbound extracts to SFMC_FSD_V1.5.docx
SFMC spec: Reckitt Benckiser BU Strategy Document v1.3.docx

Process description in Polish prepared by developer Tristan Szaniawski for RB security department (Bartlomiej.Pokrzywinski@rb.com)procedura.txt

Preconditions:

Brand-Org-Code is unique globally (particular Brand-Org-Code will not be present within two different Markets / Countries).

1) CDP →  2) SFTP → 3) Middleware

 Click here to expand...

15 of ".dat.pgp" and 15 of ".md5" files for each region (EUR, SEA, AMER). Files contain customers data delta (changes) during the last 24h.

FILE SPECIFICATION

Source

CDP DataMart

File Name

RB_<country>_<filetype_YYYYMMDDHHMMSS.dat

Field Delimiter

Pipe (|)

Row Delimiter

CRLF

EncodingUTF8
Primary KeyIndividual_ID

Headers

YES

Encryption

PGP

MD5

Yes

Datetime fields formatYYYY-MM-DD H24:MM:SS   in UTC timezone
15 Files contentCDP to SFMC Data Dictionary 20190307_required .xlsx

Files  content description

 Click here to expand...
CG_SubscriptionProfileActivity_DetailKM_Web_ActivityPromotion_HistorySurveyShopifyCSL
Contains all the subscription an Individual is opted-in to. This table will also contain the brand and market level opt-ins for an Individual. Join back to the customer table with the Individual_ID.Contains a non-consolidated view of all Individuals along with the Source Account and Data Source identifiers. Contains key actions taken by the consumer either through a GWT website, purchased list or Call Centre. Activities include Registration, Profile edit, Sample Request etc.Contains consumer browsing activity as captured by KissMetrics on MJN websites such as video viewed, articles read, etc.Contains all promotion history associated with the caregiver. Promotion history spans all channels including Direct Mail (DM), Email (EM), Text (SM) and Phone (PH).Contains responses to questions asked in market and/or to capture dynamic data such as nutritionist consultants nameCA only. Extract containing Shopify Order Detail and Product data will be created and sent to SFMC for campaigning daily after the CDP Data Mart refresh completes.  The CSL  table contains a listing of customer keys that have been made obsolete. All records listed in the table will programmatically be suppressed by the ESP


CDP Extracts are sent after the Mart refresh is complete for a particular region.

Hostsftp.skynet.epsilon.com 
Port22
AuthenticationSSH (RSA)

Region

UAT Username

PROD Username

SFTP folderUploading start time (UTC)Uploading end time (UTC)

SEA

rbnk_q_sfmc_sea

rbnk_p_sfmc_sea

(username)_<country>/outgoing  

20:3023:30

EU

rbnk_q_sfmc_eu

rbnk_p_sfmc_eu

23:3000:10

AMER

rbnk_q_sfmc_amer

rbnk_p_sfmc_amer

09:4511:45


Applicatoin description

"cdp-to-sfmc-files-integration" application is deployed on cloud and has entity for each of the region: AMER-PROD-1, EU-PROD-1, SEA-PROD-1. It is started once a day by cron. Configuration is set in the file "application-PROD.properties" and among others contains credentials for connecting with:

  • Middlware EC2 server
  • MongoDB
  • Amazon S3 and Athena services in the AWS cloud

Connection CDP sFTP → EC2 is secured by SSH and EC2 is whitelisted on CDP sFTP.  
Scripts are executed on EC2 with SSH by Mule.

Steps

  1. Download files from CDP sFTP "/outgoing/" folder to Middleware EC2, archive downloaded files.
    "1_download_files.sh" script executed on EC2:
    1. Connect to CDP sFTP
    2. Download all *.DAT.PGP files to "current/encrypted" folder on EC2. Files are downloaded with an approximate speed of 1mln records per hour.
    3. Download all *.MD5 files to "current/md5" folder on EC2
    4. Create "/archive/TTT" catalog on CDP sFTP and move there downloaded files for archival purposes.  TTT is datetime fragment picked from "customer" file.

    5. IF not all files are available on CDP sFTP script will return an error code, which will throw an exception in Mule and stop the process.
      EP-234 - Getting issue details... STATUS
      EP-253 - Getting issue details... STATUS
      EP-346 - Getting issue details... STATUS
      EP-105 - Getting issue details... STATUS

  2. Check md5 to verify files are undamaged (all md5 are copied to one file all.md5 and checked by EC2 script):
    1. Copy checksums (content) from MD5 files to file "all.md5" - "2_calculate_checksums.sh" script.
    2.  - "3_verify_checksums.sh"  script.
  3. Decrypt files to the folder: "current/decrypted" - "4_decrypt_files.sh" script.
  4. Move *.DAT.PGP and *.MD5 files to Middleware archive folder "current/acrhive/TTT", where TTT is current server datetime.
  5. Remove quotes and \n from the content of the fields (otherwise Amazon Athena splits one records to two) - "6_remove_line_brakes.sh" script.
    EP-249 - Getting issue details... STATUS
  6. Transfer CSV(*.dat) files from EC2 "cureent/decrypted" to S3 ??? Is it up-to-date step if we mounted S3 on EC2?
    EP-110 - Getting issue details... STATUS




3) Middleware

 Click here to expand...
  1. Split files per BU using Amazon Athena in a streaming manner. Requests content are available in the folder: "resources/athena-queries/PROD/ctas/[amer|eu|sea]" in the source code.
    EP-59 - Getting issue details... STATUS
  2. Write valid and invalid records to different files
    EP-73 - Getting issue details... STATUS


File specification
Name

deName-marketCode-brandOrgCode-yyyy_MM_dd_HH_mm_ss-***.gz, where:

  • deNameis a data extension name with spaces replaced by underscores, e.g. Promotion_History
  • marketCodeis a market code, e.g. MEX
  • brandOrgCodeis a brand org code, e.g. MJNMEX
  • yyyy_MM_dd_HH_mm_ssdate part is always the same during single export process because it is a start time of a process on our side (not an exact transfer date to SFMC SFTP)
  • ***is an optional part that does not affect the automation file pattern, e.g. unique identifier, count, random sequence of characters etc.
FormatCSV
HeadersNo
DelimiterSoH  (ASCII /001)
Compression

Each file is GZIPped (.gz) into it’s own archive.

However:  EP-224 - Getting issue details... STATUS





3) Middleware → 4) SFTP → 5)SFMC

 Click here to expand...
  1. Upload files from S3 to each BU sFTP according to config in MongoDB (to two BU main and)
    File destination:

    **/Triggered_Automations/CDP Import**

  2. If at any stage of the whole process some error occurs error email is sent. 

    EP-151 - Getting issue details... STATUS

    EP-47 - Getting issue details... STATUS

    EP-102 - Getting issue details... STATUS

Target Data extensionRefreshContain deletes?

Customer

Delta

Yes

CG_Subscription

Delta

No

Profile

Delta

Yes

Profile_Item_List

Delta

No

Child

Delta

Yes

Child_Enrollment

Delta

No

Activity_Detail

Delta

No

Samp_Req_Detail

Delta

No

KM_Web_Activity

Delta

No

Promotion_History

Delta

No

Survey

Delta

No

Shopify

Delta

No

Voucher_Redemption

Delta

No

CSL

Delta

No