Conversion Probability Prediction in Retargeting

(Provided and Sponsored by RECOBELL)

 

Introduction: Retargeting

Retargeting (also known as retarget advertising or remarketing) is a form of online display advertising that lets advertisers display ads to people who have previously visited their websites. It works as follows. Whenever a user visits the advertiser’s website, the user’s behavior logs, such as views and orders, are collected with an anonymized user ID. If the user then visits other online ad media such as websites and apps, the previously viewed items or related items are shown in the ad inventory (this is called an impression). If the user clicks the ad, he or she is led to the advertiser’s website and may purchase items, which is called a conversion.

In online retargeting, ads are designed to maximize clicks rather than conversions. However, the ultimate goal of advertisers is to maximize conversions within a limited marketing budget. Therefore, an ad server needs to deliver retargeting ads to the users who are likely to convert. From this point of view, RECOBELL believes that conversion probability prediction is a core technical part of retarget advertising.

Useful Link

https://en.wikipedia.org/wiki/Behavioral_retargeting

Task Description

The task is to develop an algorithm that predicts the probability of conversion when retargeting ads are shown to users. We provide view/order logs and product metadata collected from an e-commerce website for user behavior analysis (2016/08/01 ~ 2016/10/01). We also provide training data and test data from the same website’s retargeting ad campaign. The training data contains impression logs from 2016/09/01 ~ 2016/10/01 along with two labels that indicate 1) whether the ad was clicked, and 2) whether it led to a conversion. To evaluate your algorithm, we provide as test data the impression logs from 2016/10/02 ~ 2016/10/04, which contain neither click nor conversion labels. Contestants will develop models that predict the conversion probability for each test impression (click probability is not required). We will evaluate performance using logarithmic loss. The method that yields the smallest logarithmic loss wins.
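
As a point of reference for the task above, one very simple baseline is to predict the same probability for every test impression: the smoothed conversion rate observed in the training labels. The sketch below is only an illustration. The function name base_rate, the smoothing parameters, and the toy labels and impression IDs are assumptions and do not come from the contest materials.

```python
def base_rate(labels, prior=0.5, strength=1.0):
    """Smoothed fraction of positive labels; stays strictly inside (0, 1)."""
    positives = sum(labels)
    return (positives + prior * strength) / (len(labels) + strength)


# Toy is_conversion labels (made up for illustration, not contest data).
train_is_conversion = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
p = base_rate(train_is_conversion)

# Every test impression receives the same predicted conversion probability.
test_impression_ids = [101, 102, 103]
predictions = {imp_id: p for imp_id in test_impression_ids}
```

The smoothing keeps the prediction away from exactly 0 or 1, which matters because logarithmic loss is unbounded at those extremes.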

Dataset

The dataset for this competition is provided by RECOBELL and FUTURESTREAM NETWORKS.

All files are gzipped CSV files. Column information is shown below.

Retargeting Advertisement Data (provided by FUTURESTREAM NETWORKS)


Field | Valid Value | Note
impression_id | int | Ad impression ID
impression_datetime | timestamp (yyyy-MM-dd HH:mm:ss) | Timestamp of log (timezone: KST)
uid | char(7) | User ID
platform | char(1) | Platform of user (1: iPhone, 2: Android, 3: iPad)
inventory_type | char(1) | Type of inventory (A, B)
app_code | varchar(10) | Media code where the ad was exposed
os_version | varchar(10) | Version of operating system
model | varchar(255) | Model of mobile phone
network | varchar(10) | Type of connectivity (3G, 4G, WIFI ...)
is_click | int | Whether the user clicked the ad (1) or not (0)
is_conversion | int | Whether the user bought item(s) after clicking this ad
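
Reading one of these gzipped CSV files with the Python standard library might look like the sketch below. It assumes that each file starts with a header row using the field names listed above and that the text is UTF-8 encoded; neither is confirmed by the contest description. The helper names parse_impression and read_impressions and the sample row are made up for illustration.

```python
import csv
import gzip
import io
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))  # Korea Standard Time, UTC+9


def parse_impression(row):
    """Convert one CSV row (dict of strings) into typed Python values."""
    ts = datetime.strptime(row["impression_datetime"], "%Y-%m-%d %H:%M:%S")
    return {
        "impression_id": int(row["impression_id"]),
        "impression_datetime": ts.replace(tzinfo=KST),
        "uid": row["uid"],
        "platform": row["platform"],
        "inventory_type": row["inventory_type"],
        "app_code": row["app_code"],
        "os_version": row["os_version"],
        "model": row["model"],
        "network": row["network"],
        # Labels are absent in the test data, so treat them as optional.
        "is_click": int(row["is_click"]) if row.get("is_click") else None,
        "is_conversion": int(row["is_conversion"]) if row.get("is_conversion") else None,
    }


def read_impressions(fileobj):
    """Yield parsed impressions from a path or binary file object holding gzipped CSV."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8", newline="") as text:
        for row in csv.DictReader(text):
            yield parse_impression(row)


# Build a tiny in-memory gzipped CSV (made-up values) to exercise the reader.
sample = (
    "impression_id,impression_datetime,uid,platform,inventory_type,"
    "app_code,os_version,model,network,is_click,is_conversion\n"
    "1,2016-09-01 12:34:56,u000001,2,A,app01,6.0.1,SM-G920,WIFI,1,0\n"
)
buf = io.BytesIO(gzip.compress(sample.encode("utf-8")))
rows = list(read_impressions(buf))
```

Attaching the KST offset at parse time keeps impression timestamps comparable with the behavior logs, which are also recorded in KST.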

Site User Behavior Data (provided by RECOBELL)

Field | Valid Value | Note | View Log | Order Log
server_time | timestamp (yyyy-MM-dd HH:mm:ss.S) | Timestamp of log (timezone: KST) | O | O
device | char(2) | Device of user (MW: Mobile Web, MI: iPhone/iPad App, MA: Android App) | O | O
session_id | char(10) | Browser session ID (using session cookie) | O | O
uid | char(7) | User ID | O | O
item_id | char(7) | Item ID | O | O
order_id | char(7) | Order ID | | O
quantity | int | Quantity of item | | O
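
One way to turn the view and order logs into model inputs is to count, for each impression, how many views and orders the same user made strictly before the impression time. The sketch below illustrates that idea under the assumption that uid in the behavior logs identifies the same user as uid in the advertisement data. The helper names and the sample events are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime


def parse_server_time(text):
    """Parse a server_time value such as '2016-08-30 10:00:00.0'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def build_history(events):
    """Map uid -> sorted list of event timestamps."""
    history = defaultdict(list)
    for uid, ts in events:
        history[uid].append(ts)
    for stamps in history.values():
        stamps.sort()
    return history


def count_before(history, uid, when):
    """Number of events for uid strictly before `when`."""
    return bisect_left(history.get(uid, []), when)


# Made-up (uid, server_time) pairs standing in for the view and order logs.
views = [
    ("u000001", parse_server_time("2016-08-30 10:00:00.0")),
    ("u000001", parse_server_time("2016-09-02 09:00:00.0")),
    ("u000002", parse_server_time("2016-08-15 08:00:00.0")),
]
orders = [("u000001", parse_server_time("2016-08-30 10:05:00.0"))]

view_history = build_history(views)
order_history = build_history(orders)

impression_time = datetime(2016, 9, 1, 12, 0, 0)
features = {
    "views_before": count_before(view_history, "u000001", impression_time),
    "orders_before": count_before(order_history, "u000001", impression_time),
}
```

Counting only events that precede the impression avoids using information that would not have been available when the ad was shown.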

Site Product Meta Data (provided by RECOBELL)

Field | Valid Value | Note
item_id | char(7) | Product ID
price | int | Product price (currency: KRW)
category1 | char(7) | Category depth 1
category2 | char(7) | Category depth 2
category3 | char(7) | Category depth 3
category4 | char(7) | Category depth 4
brand | char(7) | Brand ID
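
The product metadata can be joined to the behavior logs through item_id. As a small illustration, the sketch below computes the average price of the items a user has viewed. The lookup structure, the function name mean_viewed_price, and all sample values are assumptions made for this example.

```python
from statistics import mean

# item_id -> metadata (made-up rows for illustration).
item_meta = {
    "i000001": {"price": 12000, "category1": "c000001"},
    "i000002": {"price": 30000, "category1": "c000002"},
}


def mean_viewed_price(viewed_item_ids, meta):
    """Average price (KRW) over viewed items that have metadata; None if none do."""
    prices = [meta[i]["price"] for i in viewed_item_ids if i in meta]
    return mean(prices) if prices else None


# The third item has no metadata row and is skipped.
avg_price = mean_viewed_price(["i000001", "i000002", "i999999"], item_meta)
```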

Evaluation Metric

Given the i-th impression in the test data (1 <= i <= N), the conversion probability prediction algorithm must produce a probability of conversion p_i, where 0 < p_i < 1. Let y_i be a binary variable indicating whether the i-th impression leads to a conversion. Submissions are then evaluated using the logarithmic loss (smaller is better).

An application will compare each solution file with the retarget_test.csv file, which contains the answers for the test set, and the results will be presented on an online scoreboard.
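
The formula itself does not appear in this text. The standard binary logarithmic loss, with y_i the label and p_i the predicted probability for impression i, is L = -(1/N) * sum over i of [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]. The sketch below computes it, assuming the natural logarithm. The clipping constant eps is an added safeguard against log(0) and is not part of the contest definition.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    if len(y_true) != len(p_pred):
        raise ValueError("y_true and p_pred must have the same length")
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Uninformative predictions versus confident, correct predictions.
uninformed = log_loss([1, 0], [0.5, 0.5])
confident = log_loss([1, 0], [0.9, 0.1])
```

A prediction of 0.5 for every impression scores log(2), and confident predictions are rewarded only when they are correct: a confident wrong prediction is penalized heavily.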

SUBMISSIONS

SCHEDULE

PRIZES (SPONSORED BY RECOBELL)

TERMS AND CONDITIONS

ORGANISING COMMITTEE

Contact

If you have any questions, please send an email to Jinwoo Park at pakdd2017@recobell.com.

About RECOBELL (Main Sponsor)

Please visit Korea’s no. 1 recommendation company:

About FUTURESTREAM NETWORKS (Data Provider)

Please visit the homepage of Korea’s no. 1 mobile ad network company: