Testing CN

Before executing your first project, verify your connectivity to the Senate Matching network, the validity of the Senate Matching process (tokenisation etc.), and your permissions. Doing so gives you confidence that you can execute successfully in a real project.

Contributor test script


These steps help validate that the contributor features of your node have been set up successfully.

#

Step

Screenshot & Notes

1

Initial Senate Matching login

  1. Type or paste the address of your Contributor Node into the browser (https://[host name of your Contributor Node]/dashboard)
    1. Your login credentials are created by your organisation when your Contributor Node is configured.
    2. If you have any issues logging in, please contact support@datarepublic.com.
  2. The Dashboard is loaded. It shows:
    1. List of databases. At launch, this will show two databases (one for production, and one for testing).
    2. System status. Covers status of the node itself ("is the database working?"), connectivity between the contributor and the Senate Matching network, and status of matcher nodes.


2

Database Management Screen

  1. Select the Test Database
  2. The database page is loaded, and shows:
    1. A database summary panel, showing:
      1. Number of tokens (should reflect the number of customers in the database)
      2. Download tokens: downloads the customer ID to token mapping
      3. Last updated date (other stats coming soon)
      4. Download data template: provides a blank CSV showing the format to use for uploads
    2. The middle panel is for uploading a CSV to update or create customer records
    3. The right-side panel is a history of recent uploads. It shows how many records were uploaded and what percentage of records had complete values

3

Tokenisation process (PII data setup)

  1. Upload a test file of PII records. You can download a test file here: contributor_2_pii.csv. This test file contains randomly generated synthetic data (10,000 rows plus header).
  2. A green progress bar shows the file being uploaded and processed.
  3. The status panel will show a new entry with today's date. After a few seconds, it updates to show how many records were processed.
  4. When this is done, the matcher nodes have been updated, meaning all the hashing, slicing and distribution is complete.
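
Before uploading, it can help to check a file locally against the figures the upload history reports (record count and share of complete records). The following is a minimal sketch and not part of the Contributor Node: the column names in the sample are hypothetical placeholders (use the headers from the data template), and treating a record as "complete" when every column is non-blank is an assumption that may differ from how the node calculates its percentage.

```python
import csv
import io


def _filled(value):
    # Missing fields arrive as None; surplus fields arrive as a list.
    return isinstance(value, str) and value.strip() != ""


def summarise_pii_csv(handle):
    """Return (row_count, percent_complete) for a CSV with one header row."""
    total = 0
    complete = 0
    for row in csv.DictReader(handle):
        total += 1
        # Assumption: a record is complete when every column is non-blank.
        if all(_filled(value) for value in row.values()):
            complete += 1
    percent = 100.0 * complete / total if total else 0.0
    return total, percent


# Made-up sample with hypothetical column names; the second record has a blank email.
sample = io.StringIO(
    "personid,email,phone\n"
    "1,a@example.com,0400000001\n"
    "2,,0400000002\n"
)
print(summarise_pii_csv(sample))  # (2, 50.0)
```

To check a real file, pass an open file handle instead, for example `open("contributor_2_pii.csv", newline="")`; for the test file, the row count should come out at 10,000.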

4

Download tokens & join to attributes

  1. Click the Download Tokens button on the left-hand side
  2. Save the resulting file when prompted (you can view an example mapping file here: contributor_2_tokens.csv, but for your testing you'll need to download the real token mapping file from your Contributor Node)
  3. The file contains two columns: the Contributor's original customer ID (called "personid") and the new token

Use this mapping file to attach the token to your customer attribute data; the tokenised attribute data is then loaded into Senate. You can do the join using whichever method you prefer: many custodians use SQL, Excel, or a scripting language.
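
As one illustration of the scripting route, the sketch below joins a mapping file to an attribute file on "personid" and writes the attributes out with the token in place of the customer ID. It rests on some assumptions: the token column header is taken to be "token" (check the header in your downloaded mapping file), the attribute column names and token values in the sample are made up, and skipping records that have no token is only one possible way to handle them.

```python
import csv
import io


def attach_tokens(mapping_handle, attributes_handle, output_handle,
                  id_column="personid", token_column="token"):
    """Replace the ID column with the token column. Returns (written, skipped)."""
    # Build a lookup from customer ID to token.
    mapping = {row[id_column]: row[token_column]
               for row in csv.DictReader(mapping_handle)}

    reader = csv.DictReader(attributes_handle)
    attribute_columns = [c for c in (reader.fieldnames or []) if c != id_column]
    writer = csv.DictWriter(output_handle,
                            fieldnames=[token_column] + attribute_columns,
                            lineterminator="\n")
    writer.writeheader()

    written = skipped = 0
    for row in reader:
        token = mapping.get(row[id_column])
        if token is None:
            # Assumption: records without a token are left out of the output.
            skipped += 1
            continue
        out = {c: row[c] for c in attribute_columns}
        out[token_column] = token
        writer.writerow(out)
        written += 1
    return written, skipped


# Made-up sample data; customer 3 has no token and is skipped.
mapping = io.StringIO("personid,token\n1,tok-aaa\n2,tok-bbb\n")
attributes = io.StringIO(
    "personid,age_band,state\n1,25-34,NSW\n2,35-44,VIC\n3,45-54,QLD\n"
)
output = io.StringIO()
print(attach_tokens(mapping, attributes, output))  # (2, 1)
print(output.getvalue())
```

The output deliberately omits the "personid" column, so the file carries only tokens and attributes. A non-zero skipped count is worth investigating before you upload.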


5

Upload tokenized attribute data into Senate

Upload file with tokens & attributes into Senate

  1. Log in to Senate for your region and go to the Manage Data screen (select the Files tab)
  2. Either use SFTP (for large files) or HTTP (for smaller files up to 100MB) to upload the file into Senate
  3. The file should contain a token column (with token values from your node) and one or more "attributes"
    1. You can download an example attribute file here: contributor_2_attr.csv. Be sure to replace its "personid" column with your "token" column
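
The choice between the two upload routes comes down to file size. A minimal sketch of that decision follows, assuming "100MB" means 100,000,000 bytes (the stricter of the two common readings); the function name and constant are illustrative and not part of Senate.

```python
# Assumption: "100MB" is read as 100,000,000 bytes, the stricter interpretation.
HTTP_LIMIT_BYTES = 100 * 1000 * 1000


def choose_upload_method(size_bytes):
    """Return "HTTP" for files within the limit, otherwise "SFTP"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    return "HTTP" if size_bytes <= HTTP_LIMIT_BYTES else "SFTP"


print(choose_upload_method(5 * 1000 * 1000))    # HTTP
print(choose_upload_method(250 * 1000 * 1000))  # SFTP
```

For a file on disk, `os.path.getsize(path)` gives the size in bytes to pass in.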

Create Database and Table

  1. Senate → Manage Data → Databases → create a new Database
  2. Create a new Table and add columns structured to match the data that you're about to load into it (attributes and tokens)

Load data into the Table

  1. When viewing the newly created Table → click 'Load Data'
  2. Select the file you wish to load and specify how many header rows it contains (the rest can be left as default)
  3. Click 'Done, Load Data'
  4. Monitor the status of your data load under Manage Data → Load Jobs
    1. If everything went well, the Row count (when viewing a table) will increase to reflect the number of loaded rows
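
To know what Row count to expect, you can count the records in the file locally. The sketch below uses Python's csv module so that quoted values containing line breaks count as one record; ignoring blank lines is an assumption about how the load behaves, and the sample columns are made up.

```python
import csv
import io


def expected_row_count(handle, header_rows=1):
    """Count CSV records, excluding header rows and blank lines."""
    records = sum(1 for record in csv.reader(handle) if record)
    return max(records - header_rows, 0)


# Made-up sample: the first data record contains a quoted line break.
sample = io.StringIO(
    'token,age_band,note\n'
    'tok-aaa,25-34,"line one\nline two"\n'
    'tok-bbb,35-44,plain\n'
)
print(expected_row_count(sample))  # 2
```

Pass the same number of header rows that you selected in the 'Load Data' dialog, then compare the result with the increase in Row count shown for the table.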


6

Create a data package + link tokens

Create a Data Package (with tokens)

  1. Senate → Manage Data → Packages → click 'Create a new data package'
  2. Fill in all the fields and click 'Create new data package'
  3. Select the table you created earlier
  4. Click 'Link token database' to tell Senate which token database generated the tokens in this package


7

Add token packages to project (Senate setup)

  1. Go to Projects in Senate and create a new Project
    1. Add people (project participants)
    2. Add a Legal Framework (either DR's or your own, in accordance with the legal agreements you have signed)
    3. Add a data package (the one you created earlier)
  2. Notify your Data Republic Customer Success contact, who will add their own package to the project to be matched against
  3. Alternatively, you can follow the exact same steps to create an additional Data Package and match against it
    1. You can match 2 packages belonging to the same Organization, as long as their tokens come from 2 different databases on the Contributor Node
    2. NOTE: if you load the exact same data into 2 different packages and run matching between them, you should get a 100% match


8

Data License

  1. Review the Packages tab and note that there are packages there with an "M" icon – this means the package contains tokens from Senate Matching and it may be possible to request a match
  2. Navigate to the Licenses tab to create a license:
    1. Select the Packages to be included in the License (when the 2 packages with tokens are selected, the "Include Token Matching" checkbox is enabled and should be checked if matching is needed)
    2. Fill in all the required fields
  3. Submit the license for approval (the Licensee Organization, the Data Contributor Organization, and the DR Platform Admin should all approve)
    1. The License can be configured so that DR's approval is not required

9

Data License Approval

  1. Review and approve the License, which also approves the token match request
  2. DR will also approve the License
  3. Once all approvals are given, you can request a match (next section)

10

Request + Execute Match

  1. Go to the 'Workspaces' tab in the project
  2. From the workspace menu, select 'Load tokens to workspace'
  3. In the dialog box, select which License and which PII field you want to match on
  4. After selecting, the match request is sent to the Aggregator Node
  5. Match results (token pair table) are loaded into the Workspace when complete
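
One way to judge the results is to compute a match rate from the token pair table. The sketch below assumes the table can be read as rows of two tokens (one per package); the actual layout in your Workspace may differ, and the token values shown are made up. When the exact same data is loaded into both packages, the rate should come out at 100%.

```python
def match_rate(pairs, package_tokens, column=0):
    """Percentage of package_tokens that appear in one column of the pair table."""
    tokens = set(package_tokens)
    if not tokens:
        return 0.0
    # Tokens from the chosen side of the pair table (0 = first package).
    matched = {pair[column] for pair in pairs}
    return 100.0 * len(tokens & matched) / len(tokens)


# Made-up sample: two of the four tokens in the first package were matched.
pairs = [("a1", "b7"), ("a2", "b9")]
tokens_a = ["a1", "a2", "a3", "a4"]
print(match_rate(pairs, tokens_a))  # 50.0
```

Use `column=1` with the second package's tokens to compute the rate from the other side.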


If the results of your match are satisfactory, you have successfully tested the end-to-end flow of Senate Matching.