Segmentation
This example demonstrates how to optimize a neural ranking model using a segmented data format.
Segmented data (sometimes referred to as “packing”) refers to flat data arrays
where a specific segments array indicates which elements should be grouped
together to form lists. This data format is more compact than a padded approach.
Rax losses and metrics support using segmented data via the segments
keyword
argument.
Instructions
Clone the Rax repo:
git clone git@github.com:google/rax.git
cd rax
Install the example dependencies:
pip install -r requirements/requirements-examples.txt
And then run the example code:
python examples/segmentation/main.py
You should see the following expected output in the form of a JSON dictionary containing the loss and ranking metrics at different steps during training.
[
{
"step": 200,
"loss": 340.78642868041993,
"mrr": 0.887844353467226,
"ndcg": 0.7206529039144516,
"ndcg@10": 0.48478462554514407
},
{
"step": 400,
"loss": 333.58397720336916,
"mrr": 0.9479761835932732,
"ndcg": 0.8064527478814125,
"ndcg@10": 0.6433215755224228
},
{
"step": 600,
"loss": 329.4338551330566,
"mrr": 0.9637777781486512,
"ndcg": 0.8601924273371696,
"ndcg@10": 0.7409686753153801
}
]